You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

4135 lines
137 KiB

26 years ago
26 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
26 years ago
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
16 years ago
26 years ago
26 years ago
21 years ago
26 years ago
21 years ago
26 years ago
26 years ago
26 years ago
21 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
21 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
21 years ago
21 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
19 years ago
19 years ago
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
16 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
26 years ago
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
16 years ago
19 years ago
21 years ago
26 years ago
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
16 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
21 years ago
26 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
19 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
26 years ago
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
26 years ago
WL#4738 streamline/simplify @@variable creation process Bug#16565 mysqld --help --verbose does not order variablesBug#20413 sql_slave_skip_counter is not shown in show variables Bug#20415 Output of mysqld --help --verbose is incomplete Bug#25430 variable not found in SELECT @@global.ft_max_word_len; Bug#32902 plugin variables don't know their names Bug#34599 MySQLD Option and Variable Reference need to be consistent in formatting! Bug#34829 No default value for variable and setting default does not raise error Bug#34834 ? Is accepted as a valid sql mode Bug#34878 Few variables have default value according to documentation but error occurs Bug#34883 ft_boolean_syntax cant be assigned from user variable to global var. Bug#37187 `INFORMATION_SCHEMA`.`GLOBAL_VARIABLES`: inconsistent status Bug#40988 log_output_basic.test succeeded though syntactically false. Bug#41010 enum-style command-line options are not honoured (maria.maria-recover fails) Bug#42103 Setting key_buffer_size to a negative value may lead to very large allocations Bug#44691 Some plugins configured as MYSQL_PLUGIN_MANDATORY in can be disabled Bug#44797 plugins w/o command-line options have no disabling option in --help Bug#46314 string system variables don't support expressions Bug#46470 sys_vars.max_binlog_cache_size_basic_32 is broken Bug#46586 When using the plugin interface the type "set" for options caused a crash. Bug#47212 Crash in DBUG_PRINT in mysqltest.cc when trying to print octal number Bug#48758 mysqltest crashes on sys_vars.collation_server_basic in gcov builds Bug#49417 some complaints about mysqld --help --verbose output Bug#49540 DEFAULT value of binlog_format isn't the default value Bug#49640 ambiguous option '--skip-skip-myisam' (double skip prefix) Bug#49644 init_connect and \0 Bug#49645 init_slave and multi-byte characters Bug#49646 mysql --show-warnings crashes when server dies
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
21 years ago
20 years ago
20 years ago
20 years ago
22 years ago
20 years ago
22 years ago
20 years ago
21 years ago
21 years ago
21 years ago
21 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
26 years ago
26 years ago
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
19 years ago
26 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
26 years ago
26 years ago
21 years ago
21 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
22 years ago
22 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
19 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
19 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
21 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
26 years ago
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
26 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
16 years ago
26 years ago
26 years ago
26 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
26 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
A fix and a test case for Bug#19022 "Memory bug when switching db during trigger execution" Bug#17199 "Problem when view calls function from another database." Bug#18444 "Fully qualified stored function names don't work correctly in SELECT statements" Documentation note: this patch introduces a change in behaviour of prepared statements. This patch adds a few new invariants with regard to how THD::db should be used. These invariants should be preserved in future: - one should never refer to THD::db by pointer and always make a deep copy (strmake, strdup) - one should never compare two databases by pointer, but use strncmp or my_strncasecmp - TABLE_LIST object table->db should be always initialized in the parser or by creator of the object. For prepared statements it means that if the current database is changed after a statement is prepared, the database that was current at prepare remains active. This also means that you can not prepare a statement that implicitly refers to the current database if the latter is not set. This is not documented, and therefore needs documentation. This is NOT a change in behavior for almost all SQL statements except: - ALTER TABLE t1 RENAME t2 - OPTIMIZE TABLE t1 - ANALYZE TABLE t1 - TRUNCATE TABLE t1 -- until this patch t1 or t2 could be evaluated at the first execution of prepared statement. CURRENT_DATABASE() still works OK and is evaluated at every execution of prepared statement. Note, that in stored routines this is not an issue as the default database is the database of the stored procedure and "use" statement is prohibited in stored routines. This patch makes obsolete the use of check_db_used (it was never used in the old code too) and all other places that check for table->db and assign it from THD::db if it's NULL, except the parser. How this patch was created: THD::{db,db_length} were replaced with a LEX_STRING, THD::db. All the places that refer to THD::{db,db_length} were manually checked and: - if the place uses thd->db by pointer, it was fixed to make a deep copy - if a place compared two db pointers, it was fixed to compare them by value (via strcmp/my_strcasecmp, whatever was approproate) Then this intermediate patch was used to write a smaller patch that does the same thing but without a rename. TODO in 5.1: - remove check_db_used - deploy THD::set_db in mysql_change_db See also comments to individual files.
20 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
26 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
26 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
26 years ago
Bug#16218 - Crash on insert delayed Bug#17294 - INSERT DELAYED puting an \n before data Bug#16611 - INSERT DELAYED corrupts data Bug#13707 - Server crash with INSERT DELAYED on MyISAM table Combined as Bug#16218. INSERT DELAYED crashed in 5.0 on a table with a varchar that could be NULL and was created pre-5.0 (Bugs 16218 and 13707). INSERT DELAYED corrupted data in 5.0 on a table with varchar fields that was created pre-5.0 (Bugs 17294 and 16611). In case of INSERT DELAYED the open table is copied from the delayed insert thread to be able to create a record for the queue. When copying the fields, a method was used that did convert old varchar to new varchar fields and did not set up some pointers into the record buffer of the table. The field conversion was guilty for the misinterpretation of the record contents by the delayed insert thread. The wrong pointer setup was guilty for the crashes. For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table) I fixed the above mentioned method to set up one of the pointers. For Bug 16218 I set up the other pointers too. But when looking at the corruptions I got aware that converting the field type was totally wrong for INSERT DELAYED. The copied table is used to create a record that is to be sent to the delayed insert thread. Of course it can interpret the record correctly only if all field types are the same in both table objects. So I revoked the fix for Bug 13707 and changed the new_field() method so that it can suppress conversions. No test case as this is a migration problem. One needs to create a table with 4.x and use it with 5.x. I added two test scripts to the bug report.
20 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
A fix and a test case for Bug#21483 "Server abort or deadlock on INSERT DELAYED with another implicit insert" Also fixes and adds test cases for bugs: 20497 "Trigger with INSERT DELAYED causes Error 1165" 21714 "Wrong NEW.value and server abort on INSERT DELAYED to a table with a trigger". Post-review fixes. Problem: In MySQL INSERT DELAYED is a way to pipe all inserts into a given table through a dedicated thread. This is necessary for simplistic storage engines like MyISAM, which do not have internal concurrency control or threading and thus can not achieve efficient INSERT throughput without support from SQL layer. DELAYED INSERT works as follows: For every distinct table, which can accept DELAYED inserts and has pending data to insert, a dedicated thread is created to write data to disk. All user connection threads that attempt to delayed-insert into this table interact with the dedicated thread in producer/consumer fashion: all records to-be inserted are pushed into a queue of the dedicated thread, which fetches the records and writes them. In this design, client connection threads never open or lock the delayed insert table. This functionality was introduced in version 3.23 and does not take into account existence of triggers, views, or pre-locking. E.g. if INSERT DELAYED is called from a stored function, which, in turn, is called from another stored function that uses the delayed table, a deadlock can occur, because delayed locking by-passes pre-locking. Besides: * the delayed thread works directly with the subject table through the storage engine API and does not invoke triggers * even if it was patched to invoke triggers, if triggers, in turn, used other tables, the delayed thread would have to open and lock involved tables (use pre-locking). * even if it was patched to use pre-locking, without deadlock detection the delayed thread could easily lock out user connection threads in case when the same table is used both in a trigger and on the right side of the insert query: the delayed thread would not release locks until all inserts are complete, and user connection can not complete inserts without having locks on the tables used on the right side of the query. Solution: These considerations suggest two general alternatives for the future of INSERT DELAYED: * it is considered a full-fledged alternative to normal INSERT * it is regarded as an optimisation that is only relevant for simplistic engines. Since we missed our chance to provide complete support of new features when 5.0 was in development, the first alternative currently renders infeasible. However, even the second alternative, which is to detect new features and convert DELAYED insert into a normal insert, is not easy to implement. The catch-22 is that we don't know if the subject table has triggers or is a view before we open it, and we only open it in the delayed thread. We don't know if the query involves pre-locking until we have opened all tables, and we always first create the delayed thread, and only then open the remaining tables. This patch detects the problematic scenarios and converts DELAYED INSERT to a normal INSERT using the following approach: * if the statement is executed under pre-locking (e.g. from within a stored function or trigger) or the right side may require pre-locking, we detect the situation before creating a delayed insert thread and convert the statement to a conventional INSERT. * if the subject table is a view or has triggers, we shutdown the delayed thread and convert the statement to a conventional INSERT.
19 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
Initial import of WL#3726 "DDL locking for all metadata objects". Backport of: ------------------------------------------------------------ revno: 2630.4.1 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Fri 2008-05-23 17:54:03 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ------------------------------------------------------------ This is the first patch in series. It transforms the metadata locking subsystem to use a dedicated module (mdl.h,cc). No significant changes in the locking protocol. The import passes the test suite with the exception of deprecated/removed 6.0 features, and MERGE tables. The latter are subject to a fix by WL#4144. Unfortunately, the original changeset comments got lost in a merge, thus this import has its own (largely insufficient) comments. This patch fixes Bug#25144 "replication / binlog with view breaks". Warning: this patch introduces an incompatible change: Under LOCK TABLES, it's no longer possible to FLUSH a table that was not locked for WRITE. Under LOCK TABLES, it's no longer possible to DROP a table or VIEW that was not locked for WRITE. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.2 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:03:45 +0400 message: WL#3726 "DDL locking for all metadata objects". After review fixes in progress. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.3 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 14:08:51 +0400 message: WL#3726 "DDL locking for all metadata objects" Fixed failing Windows builds by adding mdl.cc to the lists of files needed to build server/libmysqld on Windows. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.4 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sat 2008-05-24 21:57:58 +0400 message: WL#3726 "DDL locking for all metadata objects". Fix for assert failures in kill.test which occured when one tried to kill ALTER TABLE statement on merge table while it was waiting in wait_while_table_is_used() for other connections to close this table. These assert failures stemmed from the fact that cleanup code in this case assumed that temporary table representing new version of table was open with adding to THD::temporary_tables list while code which were opening this temporary table wasn't always fulfilling this. This patch changes code that opens new version of table to always do this linking in. It also streamlines cleanup process for cases when error occurs while we have new version of table open. ****** WL#3726 "DDL locking for all metadata objects" Add libmysqld/mdl.cc to .bzrignore. ****** Backport of: ------------------------------------------------------------ revno: 2630.4.6 committer: Dmitry Lenev <dlenev@mysql.com> branch nick: mysql-6.0-3726-w timestamp: Sun 2008-05-25 00:33:22 +0400 message: WL#3726 "DDL locking for all metadata objects". Addition to the fix of assert failures in kill.test caused by changes for this worklog. Make sure we close the new table only once.
16 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
Backport of revno: 2617.69.34 Bug #45949 Assertion `!tables->table' in open_tables() on ALTER + INSERT DELAYED The assertion was caused by improperly closing tables when INSERT DELAYED needed to reopen tables. This patch replaces the call to close_thread_tables with close_tables_for_reopen which fixes the problem. The only way I was able to trigger the reopen code path and thus the assertion, was if ALTER TABLE killed the delayed insert thread and the delayed insert thread was able to enter the reopen code path before it noticed that thd->killed had been set. Note that in these cases reopen will always fail since open_table() will check thd->killed and return. This patch therefore adds two more thd->killed checks to minimize the chance of entering the reopen code path without hope for success. The patch also changes it so that if the delayed insert is killed using KILL_CONNECTION, the error message that is copied to the connection thread is ER_QUERY_INTERRUPTED rather than ER_SERVER_SHUTDOWN. This means that if INSERT DELAYED fails, the user will now see "Query execution was interrupted" rather than the misleading "Server shutdown in progress". No test case is supplied. This is for two reasons: 1) Unable to reproduce the error without having the delayed insert thread in a killed state which means that reopen is futile and was not supposed to be attempted. 2) Difficulty of using sync points in other threads than the connection thread. The patch has been successfully tested with the RQG and the grammar supplied in the bug description.
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
26 years ago
Bug#34043: Server loops excessively in _checkchunk() when safemalloc is enabled Essentially, the problem is that safemalloc is excruciatingly slow as it checks all allocated blocks for overrun at each memory management primitive, yielding a almost exponential slowdown for the memory management functions (malloc, realloc, free). The overrun check basically consists of verifying some bytes of a block for certain magic keys, which catches some simple forms of overrun. Another minor problem is violation of aliasing rules and that its own internal list of blocks is prone to corruption. Another issue with safemalloc is rather the maintenance cost as the tool has a significant impact on the server code. Given the magnitude of memory debuggers available nowadays, especially those that are provided with the platform malloc implementation, maintenance of a in-house and largely obsolete memory debugger becomes a burden that is not worth the effort due to its slowness and lack of support for detecting more common forms of heap corruption. Since there are third-party tools that can provide the same functionality at a lower or comparable performance cost, the solution is to simply remove safemalloc. Third-party tools can provide the same functionality at a lower or comparable performance cost. The removal of safemalloc also allows a simplification of the malloc wrappers, removing quite a bit of kludge: redefinition of my_malloc, my_free and the removal of the unused second argument of my_free. Since free() always check whether the supplied pointer is null, redudant checks are also removed. Also, this patch adds unit testing for my_malloc and moves my_realloc implementation into the same file as the other memory allocation primitives.
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
16 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
26 years ago
26 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
26 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
26 years ago
25 years ago
26 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
20 years ago
20 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
20 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Bug #45225 Locking: hang if drop table with no timeout This patch introduces timeouts for metadata locks. The timeout is specified in seconds using the new dynamic system variable "lock_wait_timeout" which has both GLOBAL and SESSION scopes. Allowed values range from 1 to 31536000 seconds (= 1 year). The default value is 1 year. The new server parameter "lock-wait-timeout" can be used to set the default value parameter upon server startup. "lock_wait_timeout" applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH READ LOCK and HANDLER statements. The patch also changes thr_lock.c code (table data locks used by MyISAM and other simplistic engines) to use the same system variable. InnoDB row locks are unaffected. One exception to the handling of the "lock_wait_timeout" variable is delayed inserts. All delayed inserts are executed with a timeout of 1 year regardless of the setting for the global variable. As the connection issuing the delayed insert gets no notification of delayed insert timeouts, we want to avoid unnecessary timeouts. It's important to note that the timeout value is used for each lock acquired and that one statement can take more than one lock. A statement can therefore block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported. Test case added to lock_multi.test.
16 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
26 years ago
26 years ago
Prevent bugs by making DBUG_* expressions syntactically equivalent to a single statement. --- Bug#24795: SHOW PROFILE Profiling is only partially functional on some architectures. Where there is no getrusage() system call, presently Null values are returned where it would be required. Notably, Windows needs some love applied to make it as useful. Syntax this adds: SHOW PROFILES SHOW PROFILE [types] [FOR QUERY n] [OFFSET n] [LIMIT n] where "n" is an integer and "types" is zero or many (comma-separated) of "CPU" "MEMORY" (not presently supported) "BLOCK IO" "CONTEXT SWITCHES" "PAGE FAULTS" "IPC" "SWAPS" "SOURCE" "ALL" It also adds a session variable (boolean) "profiling", set to "no" by default, and (integer) profiling_history_size, set to 15 by default. This patch abstracts setting THDs' "proc_info" behind a macro that can be used as a hook into the profiling code when profiling support is compiled in. All future code in this line should use that mechanism for setting thd->proc_info. --- Tests are now set to omit the statistics. --- Adds an Information_schema table, "profiling" for access to "show profile" data. --- Merge zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community-3--bug24795 into zippy.cornsilk.net:/home/cmiller/work/mysql/mysql-5.0-community --- Fix merge problems. --- Fixed one bug in the query_source being NULL. Updated test results. --- Include more thorough profiling tests. Improve support for prepared statements. Use session-specific query IDs, starting at zero. --- Selecting from I_S.profiling is no longer quashed in profiling, as requested by Giuseppe. Limit the size of captured query text. No longer log queries that are zero length.
19 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
21 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
21 years ago
26 years ago
26 years ago
26 years ago
19 years ago
19 years ago
WL#3817: Simplify string / memory area types and make things more consistent (first part) The following type conversions was done: - Changed byte to uchar - Changed gptr to uchar* - Change my_string to char * - Change my_size_t to size_t - Change size_s to size_t Removed declaration of byte, gptr, my_string, my_size_t and size_s. Following function parameter changes was done: - All string functions in mysys/strings was changed to use size_t instead of uint for string lengths. - All read()/write() functions changed to use size_t (including vio). - All protocoll functions changed to use size_t instead of uint - Functions that used a pointer to a string length was changed to use size_t* - Changed malloc(), free() and related functions from using gptr to use void * as this requires fewer casts in the code and is more in line with how the standard functions work. - Added extra length argument to dirname_part() to return the length of the created string. - Changed (at least) following functions to take uchar* as argument: - db_dump() - my_net_write() - net_write_command() - net_store_data() - DBUG_DUMP() - decimal2bin() & bin2decimal() - Changed my_compress() and my_uncompress() to use size_t. Changed one argument to my_uncompress() from a pointer to a value as we only return one value (makes function easier to use). - Changed type of 'pack_data' argument to packfrm() to avoid casts. - Changed in readfrm() and writefrom(), ha_discover and handler::discover() the type for argument 'frmdata' to uchar** to avoid casts. - Changed most Field functions to use uchar* instead of char* (reduced a lot of casts). - Changed field->val_xxx(xxx, new_ptr) to take const pointers. Other changes: - Removed a lot of not needed casts - Added a few new cast required by other changes - Added some cast to my_multi_malloc() arguments for safety (as string lengths needs to be uint, not size_t). - Fixed all calls to hash-get-key functions to use size_t*. (Needed to be done explicitely as this conflict was often hided by casting the function to hash_get_key). - Changed some buffers to memory regions to uchar* to avoid casts. - Changed some string lengths from uint to size_t. - Changed field->ptr to be uchar* instead of char*. This allowed us to get rid of a lot of casts. - Some changes from true -> TRUE, false -> FALSE, unsigned char -> uchar - Include zlib.h in some files as we needed declaration of crc32() - Changed MY_FILE_ERROR to be (size_t) -1. - Changed many variables to hold the result of my_read() / my_write() to be size_t. This was needed to properly detect errors (which are returned as (size_t) -1). - Removed some very old VMS code - Changed packfrm()/unpackfrm() to not be depending on uint size (portability fix) - Removed windows specific code to restore cursor position as this causes slowdown on windows and we should not mix read() and pread() calls anyway as this is not thread safe. Updated function comment to reflect this. Changed function that depended on original behavior of my_pwrite() to itself restore the cursor position (one such case). - Added some missing checking of return value of malloc(). - Changed definition of MOD_PAD_CHAR_TO_FULL_LENGTH to avoid 'long' overflow. - Changed type of table_def::m_size from my_size_t to ulong to reflect that m_size is the number of elements in the array, not a string/memory length. - Moved THD::max_row_length() to table.cc (as it's not depending on THD). Inlined max_row_length_blob() into this function. - More function comments - Fixed some compiler warnings when compiled without partitions. - Removed setting of LEX_STRING() arguments in declaration (portability fix). - Some trivial indentation/variable name changes. - Some trivial code simplifications: - Replaced some calls to alloc_root + memcpy to use strmake_root()/strdup_root(). - Changed some calls from memdup() to strmake() (Safety fix) - Simpler loops in client-simple.c
19 years ago
26 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
Manual merge from 5.0-rpl, of fixes for: 1) BUG#25507 "multi-row insert delayed + auto increment causes duplicate key entries on slave" (two concurrrent connections doing multi-row INSERT DELAYED to insert into an auto_increment column, caused replication slave to stop with "duplicate key error" (and binlog was wrong), and BUG#26116 "If multi-row INSERT DELAYED has errors, statement-based binlogging breaks" (the binlog was not accounting for all rows inserted, or slave could stop). The fix is that: in statement-based binlogging, a multi-row INSERT DELAYED is silently converted to a non-delayed INSERT. This is supposed to not affect many 5.1 users as in 5.1, the default binlog format is "mixed", which does not have the bug (the bug is only with binlog_format=STATEMENT). We should document how the system delayed_insert thread decides of its binlog format (which is not modified by this patch): this decision is taken when the thread is created and holds until it is terminated (is not affected by any later change via SET GLOBAL BINLOG_FORMAT). It is also not affected by the binlog format of the connection which issues INSERT DELAYED (this binlog format does not affect how the row will be binlogged). If one wants to change the binlog format of its server with SET GLOBAL BINLOG_FORMAT, it should do FLUSH TABLES to be sure all delayed_insert threads terminate and thus new threads are created, taking into account the new format. 2) BUG#24432 "INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values". When in an INSERT ON DUPLICATE KEY UPDATE, using an autoincrement column, we inserted some autogenerated values and also updated some rows, some autogenerated values were not used (for example, even if 10 was the largest autoinc value in the table at the start of the statement, 12 could be the first autogenerated value inserted by the statement, instead of 11). One autogenerated value was lost per updated row. Led to exhausting the range of the autoincrement column faster. Bug introduced by fix of BUG#20188; present since 5.0.24 and 5.1.12. This bug breaks replication from a pre-5.0.24/pre-5.1.12 master. But the present bugfix, as it makes INSERT ON DUP KEY UPDATE behave like pre-5.0.24/pre-5.1.12, breaks replication from a [5.0.24,5.0.34]/[5.1.12,5.1.15] master to a fixed (5.0.36/5.1.16) slave! To warn users against this when they upgrade their slave, as agreed with the support team, we add code for a fixed slave to detect that it is connected to a buggy master in a situation (INSERT ON DUP KEY UPDATE into autoinc column) likely to break replication, in which case it cannot replicate so stops and prints a message to the slave's error log and to SHOW SLAVE STATUS. For 5.0.36->[5.0.24,5.0.34] replication or 5.1.16->[5.1.12,5.1.15] replication we cannot warn as master does not know the slave's version (but we always recommended to users to have slave at least as new as master). As agreed with support, I have asked for an alert to be put into the MySQL Network Monitoring and Advisory Service. 3) note that I'll re-enable rpl_insert_id as soon as 5.1-rpl gets the changes from the main 5.1.
19 years ago
26 years ago
21 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
19 years ago
26 years ago
21 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
26 years ago
21 years ago
21 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
26 years ago
26 years ago
23 years ago
23 years ago
26 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
23 years ago
(pushing for Andrei) Bug #27417 thd->no_trans_update.stmt lost value inside of SF-exec-stack Once had been set the flag might later got reset inside of a stored routine execution stack. The reason was in that there was no check if a new statement started at time of resetting. The artifact affects most of binlogable DML queries. Notice, that multi-update is wrapped up within bug@27716 fix, multi-delete bug@29136. Fixed with saving parent's statement flag of whether the statement modified non-transactional table, and unioning (merging) the value with that was gained in mysql_execute_command. Resettling thd->no_trans_update members into thd->transaction.`member`; Asserting code; Effectively the following properties are held. 1. At the end of a substatement thd->transaction.stmt.modified_non_trans_table reflects the fact if such a table got modified by the substatement. That also respects THD::really_abort_on_warnin() requirements. 2. Eventually thd->transaction.stmt.modified_non_trans_table will be computed as the union of the values of all invoked sub-statements. That fixes this bug#27417; Computing of thd->transaction.all.modified_non_trans_table is refined to base to the stmt's value for all the case including insert .. select statement which before the patch had an extra issue bug@28960. Minor issues are covered with mysql_load, mysql_delete, and binloggin of insert in to temp_table select. The supplied test verifies limitely, mostly asserts. The ultimate testing is defered for bug@13270, bug@23333.
19 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
26 years ago
26 years ago
26 years ago
WL#3146 "less locking in auto_increment": this is a cleanup patch for our current auto_increment handling: new names for auto_increment variables in THD, new methods to manipulate them (see sql_class.h), some move into handler::, causing less backup/restore work when executing substatements. This makes the logic hopefully clearer, less work is is needed in mysql_insert(). By cleaning up, using different variables for different purposes (instead of one for 3 things...), we fix those bugs, which someone may want to fix in 5.0 too: BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate statement-based" BUG#20341 "stored function inserting into one auto_increment puts bad data in slave" BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE" (now if a row is updated, LAST_INSERT_ID() will return its id) and re-fixes: BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT" (already fixed differently by Ramil in 4.1) Test of documented behaviour of mysql_insert_id() (there was no test). The behaviour changes introduced are: - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value successfully inserted", instead of "the first autogenerated auto_increment value if any row was successfully inserted", see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() returns the id of the updated row if ON DUPLICATE KEY UPDATE, see auto_increment.test. Same for mysql_insert_id(), see mysql_client_test.c. - LAST_INSERT_ID() does not change if no autogenerated value was successfully inserted (it used to then be 0), see auto_increment.test. - if in INSERT SELECT no autogenerated value was successfully inserted, mysql_insert_id() now returns the id of the last inserted row (it already did this for INSERT VALUES), see mysql_client_test.c. - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X (it already did this for INSERT VALUES), see mysql_client_test.c. - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE, the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID influences not only the first row now. Additionally, when unlocking a table we check that the thread is not keeping a next_insert_id (as the table is unlocked that id is potentially out-of-date); forgetting about this next_insert_id is done in a new handler::ha_release_auto_increment(). Finally we prepare for engines capable of reserving finite-length intervals of auto_increment values: we store such intervals in THD. The next step (to be done by the replication team in 5.1) is to read those intervals from THD and actually store them in the statement-based binary log. NDB will be a good engine to test that.
20 years ago
26 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
26 years ago
26 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
19 years ago
20 years ago
20 years ago
20 years ago
Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE corrupts a MERGE table Bug 26867 - LOCK TABLES + REPAIR + merge table result in memory/cpu hogging Bug 26377 - Deadlock with MERGE and FLUSH TABLE Bug 25038 - Waiting TRUNCATE Bug 25700 - merge base tables get corrupted by optimize/analyze/repair table Bug 30275 - Merge tables: flush tables or unlock tables causes server to crash Bug 19627 - temporary merge table locking Bug 27660 - Falcon: merge table possible Bug 30273 - merge tables: Can't lock file (errno: 155) The problems were: Bug 26379 - Combination of FLUSH TABLE and REPAIR TABLE corrupts a MERGE table 1. A thread trying to lock a MERGE table performs busy waiting while REPAIR TABLE or a similar table administration task is ongoing on one or more of its MyISAM tables. 2. A thread trying to lock a MERGE table performs busy waiting until all threads that did REPAIR TABLE or similar table administration tasks on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK TABLES. The difference against problem #1 is that the busy waiting takes place *after* the administration task. It is terminated by UNLOCK TABLES only. 3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the lock. This does *not* require a MERGE table. The first FLUSH TABLES can be replaced by any statement that requires other threads to reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke the problem. Bug 26867 - LOCK TABLES + REPAIR + merge table result in memory/cpu hogging Trying DML on a MERGE table, which has a child locked and repaired by another thread, made an infinite loop in the server. Bug 26377 - Deadlock with MERGE and FLUSH TABLE Locking a MERGE table and its children in parent-child order and flushing the child deadlocked the server. Bug 25038 - Waiting TRUNCATE Truncating a MERGE child, while the MERGE table was in use, let the truncate fail instead of waiting for the table to become free. Bug 25700 - merge base tables get corrupted by optimize/analyze/repair table Repairing a child of an open MERGE table corrupted the child. It was necessary to FLUSH the child first. Bug 30275 - Merge tables: flush tables or unlock tables causes server to crash Flushing and optimizing locked MERGE children crashed the server. Bug 19627 - temporary merge table locking Use of a temporary MERGE table with non-temporary children could corrupt the children. Temporary tables are never locked. So we do now prohibit non-temporary chidlren of a temporary MERGE table. Bug 27660 - Falcon: merge table possible It was possible to create a MERGE table with non-MyISAM children. Bug 30273 - merge tables: Can't lock file (errno: 155) This was a Windows-only bug. Table administration statements sometimes failed with "Can't lock file (errno: 155)". These bugs are fixed by a new implementation of MERGE table open. When opening a MERGE table in open_tables() we do now add the child tables to the list of tables to be opened by open_tables() (the "query_list"). The children are not opened in the handler at this stage. After opening the parent, open_tables() opens each child from the now extended query_list. When the last child is opened, we remove the children from the query_list again and attach the children to the parent. This behaves similar to the old open. However it does not open the MyISAM tables directly, but grabs them from the already open children. When closing a MERGE table in close_thread_table() we detach the children only. Closing of the children is done implicitly because they are in thd->open_tables. For more detail see the comment at the top of ha_myisammrg.cc. Changed from open_ltable() to open_and_lock_tables() in all places that can be relevant for MERGE tables. The latter can handle tables added to the list on the fly. When open_ltable() was used in a loop over a list of tables, the list must be temporarily terminated after every table for open_and_lock_tables(). table_list->required_type is set to FRMTYPE_TABLE to avoid open of special tables. Handling of derived tables is suppressed. These details are handled by the new function open_n_lock_single_table(), which has nearly the same signature as open_ltable() and can replace it in most cases. In reopen_tables() some of the tables open by a thread can be closed and reopened. When a MERGE child is affected, the parent must be closed and reopened too. Closing of the parent is forced before the first child is closed. Reopen happens in the order of thd->open_tables. MERGE parents do not attach their children automatically at open. This is done after all tables are reopened. So all children are open when attaching them. Special lock handling like mysql_lock_abort() or mysql_lock_remove() needs to be suppressed for MERGE children or forwarded to the parent. This depends on the situation. In loops over all open tables one suppresses child lock handling. When a single table is touched, forwarding is done. Behavioral changes: =================== This patch changes the behavior of temporary MERGE tables. Temporary MERGE must have temporary children. The old behavior was wrong. A temporary table is not locked. Hence even non-temporary children were not locked. See Bug 19627 - temporary merge table locking. You cannot change the union list of a non-temporary MERGE table when LOCK TABLES is in effect. The following does *not* work: CREATE TABLE m1 ... ENGINE=MRG_MYISAM ...; LOCK TABLES t1 WRITE, t2 WRITE, m1 WRITE; ALTER TABLE m1 ... UNION=(t1,t2) ...; However, you can do this with a temporary MERGE table. You cannot create a MERGE table with CREATE ... SELECT, neither as a temporary MERGE table, nor as a non-temporary MERGE table. CREATE TABLE m1 ... ENGINE=MRG_MYISAM ... SELECT ...; Gives error message: table is not BASE TABLE.
18 years ago
20 years ago
20 years ago
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
5.1 version of a fix and test cases for bugs: Bug#4968 ""Stored procedure crash if cursor opened on altered table" Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing" Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from stored procedure." Bug#19733 "Repeated alter, or repeated create/drop, fails" Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server" Bug#24879 "Prepared Statements: CREATE TABLE (UTF8 KEY) produces a growing key length" (this bug is not fixed in 5.0) Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE statements in stored routines or as prepared statements caused incorrect results (and crashes in versions prior to 5.0.25). In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options). The problem of bugs 4968, 19733, 19282 and 6895 was that functions mysql_prepare_table, mysql_create_table and mysql_alter_table are not re-execution friendly: during their operation they modify contents of LEX (members create_info, alter_info, key_list, create_list), thus making the LEX unusable for the next execution. In particular, these functions removed processed columns and keys from create_list, key_list and drop_list. Search the code in sql_table.cc for drop_it.remove() and similar patterns to find evidence. The fix is to supply to these functions a usable copy of each of the above structures at every re-execution of an SQL statement. To simplify memory management, LEX::key_list and LEX::create_list were added to LEX::alter_info, a fresh copy of which is created for every execution. The problem of crashing bug 22060 stemmed from the fact that the above metnioned functions were not only modifying HA_CREATE_INFO structure in LEX, but also were changing it to point to areas in volatile memory of the execution memory root. The patch solves this problem by creating and using an on-stack copy of HA_CREATE_INFO in mysql_execute_command. Additionally, this patch splits the part of mysql_alter_table that analizes and rewrites information from the parser into a separate function - mysql_prepare_alter_table, in analogy with mysql_prepare_table, which is renamed to mysql_prepare_create_table.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
20 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
20 years ago
A prerequisite patch for the fix for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". Introduce a notion of a sentinel to MDL_context. A sentinel is a ticket that separates all tickets in the context into two groups: before and after it. Currently we can have (and need) only one designated sentinel -- it separates all locks taken by LOCK TABLE or HANDLER statement, which must survive COMMIT and ROLLBACK and all other locks, which must be released at COMMIT or ROLLBACK. The tricky part is maintaining the sentinel up to date when someone release its corresponding ticket. This can happen, e.g. if someone issues DROP TABLE under LOCK TABLES (generally, see all calls to release_all_locks_for_name()). MDL_context::release_ticket() is modified to take care of it. ****** A fix and a test case for Bug#46224 "HANDLER statements within a transaction might lead to deadlocks". An attempt to mix HANDLER SQL statements, which are transaction- agnostic, an open multi-statement transaction, and DDL against the involved tables (in a concurrent connection) could lead to a deadlock. The deadlock would occur when HANDLER OPEN or HANDLER READ would have to wait on a conflicting metadata lock. If the connection that issued HANDLER statement also had other metadata locks (say, acquired in scope of a transaction), a classical deadlock situation of mutual wait could occur. Incompatible change: entering LOCK TABLES mode automatically closes all open HANDLERs in the current connection. Incompatible change: previously an attempt to wait on a lock in a connection that has an open HANDLER statement could wait indefinitely/deadlock. After this patch, an error ER_LOCK_DEADLOCK is produced. The idea of the fix is to merge thd->handler_mdl_context with the main mdl_context of the connection, used for transactional locks. This makes deadlock detection possible, since all waits with locks are "visible" and available to analysis in a single MDL context of the connection. Since HANDLER locks and transactional locks have a different life cycle -- HANDLERs are explicitly open and closed, and so are HANDLER locks, explicitly acquired and released, whereas transactional locks "accumulate" till the end of a transaction and are released only with COMMIT, ROLLBACK and ROLLBACK TO SAVEPOINT, a concept of "sentinel" was introduced to MDL_context. All locks, HANDLER and others, reside in the same linked list. However, a selected element of the list separates locks with different life cycle. HANDLER locks always reside at the end of the list, after the sentinel. Transactional locks are prepended to the beginning of the list, before the sentinel. Thus, ROLLBACK, COMMIT or ROLLBACK TO SAVEPOINT, only release those locks that reside before the sentinel. HANDLER locks must be released explicitly as part of HANDLER CLOSE statement, or an implicit close. The same approach with sentinel is also employed for LOCK TABLES locks. Since HANDLER and LOCK TABLES statement has never worked together, the implementation is made simple and only maintains one sentinel, which is used either for HANDLER locks, or for LOCK TABLES locks.
16 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
26 years ago
26 years ago
26 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
26 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
This changeset is largely a handler cleanup changeset (WL#3281), but includes fixes and cleanups that was found necessary while testing the handler changes Changes that requires code changes in other code of other storage engines. (Note that all changes are very straightforward and one should find all issues by compiling a --debug build and fixing all compiler errors and all asserts in field.cc while running the test suite), - New optional handler function introduced: reset() This is called after every DML statement to make it easy for a handler to statement specific cleanups. (The only case it's not called is if force the file to be closed) - handler::extra(HA_EXTRA_RESET) is removed. Code that was there before should be moved to handler::reset() - table->read_set contains a bitmap over all columns that are needed in the query. read_row() and similar functions only needs to read these columns - table->write_set contains a bitmap over all columns that will be updated in the query. write_row() and update_row() only needs to update these columns. The above bitmaps should now be up to date in all context (including ALTER TABLE, filesort()). The handler is informed of any changes to the bitmap after fix_fields() by calling the virtual function handler::column_bitmaps_signal(). If the handler does caching of these bitmaps (instead of using table->read_set, table->write_set), it should redo the caching in this code. as the signal() may be sent several times, it's probably best to set a variable in the signal and redo the caching on read_row() / write_row() if the variable was set. - Removed the read_set and write_set bitmap objects from the handler class - Removed all column bit handling functions from the handler class. (Now one instead uses the normal bitmap functions in my_bitmap.c instead of handler dedicated bitmap functions) - field->query_id is removed. One should instead instead check table->read_set and table->write_set if a field is used in the query. - handler::extra(HA_EXTRA_RETRIVE_ALL_COLS) and handler::extra(HA_EXTRA_RETRIEVE_PRIMARY_KEY) are removed. One should now instead use table->read_set to check for which columns to retrieve. - If a handler needs to call Field->val() or Field->store() on columns that are not used in the query, one should install a temporary all-columns-used map while doing so. For this, we provide the following functions: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->read_set); field->val(); dbug_tmp_restore_column_map(table->read_set, old_map); and similar for the write map: my_bitmap_map *old_map= dbug_tmp_use_all_columns(table, table->write_set); field->val(); dbug_tmp_restore_column_map(table->write_set, old_map); If this is not done, you will sooner or later hit a DBUG_ASSERT in the field store() / val() functions. (For not DBUG binaries, the dbug_tmp_restore_column_map() and dbug_tmp_restore_column_map() are inline dummy functions and should be optimized away be the compiler). - If one needs to temporary set the column map for all binaries (and not just to avoid the DBUG_ASSERT() in the Field::store() / Field::val() methods) one should use the functions tmp_use_all_columns() and tmp_restore_column_map() instead of the above dbug_ variants. - All 'status' fields in the handler base class (like records, data_file_length etc) are now stored in a 'stats' struct. This makes it easier to know what status variables are provided by the base handler. This requires some trivial variable names in the extra() function. - New virtual function handler::records(). This is called to optimize COUNT(*) if (handler::table_flags() & HA_HAS_RECORDS()) is true. (stats.records is not supposed to be an exact value. It's only has to be 'reasonable enough' for the optimizer to be able to choose a good optimization path). - Non virtual handler::init() function added for caching of virtual constants from engine. - Removed has_transactions() virtual method. Now one should instead return HA_NO_TRANSACTIONS in table_flags() if the table handler DOES NOT support transactions. - The 'xxxx_create_handler()' function now has a MEM_ROOT_root argument that is to be used with 'new handler_name()' to allocate the handler in the right area. The xxxx_create_handler() function is also responsible for any initialization of the object before returning. For example, one should change: static handler *myisam_create_handler(TABLE_SHARE *table) { return new ha_myisam(table); } -> static handler *myisam_create_handler(TABLE_SHARE *table, MEM_ROOT *mem_root) { return new (mem_root) ha_myisam(table); } - New optional virtual function: use_hidden_primary_key(). This is called in case of an update/delete when (table_flags() and HA_PRIMARY_KEY_REQUIRED_FOR_DELETE) is defined but we don't have a primary key. This allows the handler to take precisions in remembering any hidden primary key to able to update/delete any found row. The default handler marks all columns to be read. - handler::table_flags() now returns a ulonglong (to allow for more flags). - New/changed table_flags() - HA_HAS_RECORDS Set if ::records() is supported - HA_NO_TRANSACTIONS Set if engine doesn't support transactions - HA_PRIMARY_KEY_REQUIRED_FOR_DELETE Set if we should mark all primary key columns for read when reading rows as part of a DELETE statement. If there is no primary key, all columns are marked for read. - HA_PARTIAL_COLUMN_READ Set if engine will not read all columns in some cases (based on table->read_set) - HA_PRIMARY_KEY_ALLOW_RANDOM_ACCESS Renamed to HA_PRIMARY_KEY_REQUIRED_FOR_POSITION. - HA_DUPP_POS Renamed to HA_DUPLICATE_POS - HA_REQUIRES_KEY_COLUMNS_FOR_DELETE Set this if we should mark ALL key columns for read when when reading rows as part of a DELETE statement. In case of an update we will mark all keys for read for which key part changed value. - HA_STATS_RECORDS_IS_EXACT Set this if stats.records is exact. (This saves us some extra records() calls when optimizing COUNT(*)) - Removed table_flags() - HA_NOT_EXACT_COUNT Now one should instead use HA_HAS_RECORDS if handler::records() gives an exact count() and HA_STATS_RECORDS_IS_EXACT if stats.records is exact. - HA_READ_RND_SAME Removed (no one supported this one) - Removed not needed functions ha_retrieve_all_cols() and ha_retrieve_all_pk() - Renamed handler::dupp_pos to handler::dup_pos - Removed not used variable handler::sortkey Upper level handler changes: - ha_reset() now does some overall checks and calls ::reset() - ha_table_flags() added. This is a cached version of table_flags(). The cache is updated on engine creation time and updated on open. MySQL level changes (not obvious from the above): - DBUG_ASSERT() added to check that column usage matches what is set in the column usage bit maps. (This found a LOT of bugs in current column marking code). - In 5.1 before, all used columns was marked in read_set and only updated columns was marked in write_set. Now we only mark columns for which we need a value in read_set. - Column bitmaps are created in open_binary_frm() and open_table_from_share(). (Before this was in table.cc) - handler::table_flags() calls are replaced with handler::ha_table_flags() - For calling field->val() you must have the corresponding bit set in table->read_set. For calling field->store() you must have the corresponding bit set in table->write_set. (There are asserts in all store()/val() functions to catch wrong usage) - thd->set_query_id is renamed to thd->mark_used_columns and instead of setting this to an integer value, this has now the values: MARK_COLUMNS_NONE, MARK_COLUMNS_READ, MARK_COLUMNS_WRITE Changed also all variables named 'set_query_id' to mark_used_columns. - In filesort() we now inform the handler of exactly which columns are needed doing the sort and choosing the rows. - The TABLE_SHARE object has a 'all_set' column bitmap one can use when one needs a column bitmap with all columns set. (This is used for table->use_all_columns() and other places) - The TABLE object has 3 column bitmaps: - def_read_set Default bitmap for columns to be read - def_write_set Default bitmap for columns to be written - tmp_set Can be used as a temporary bitmap when needed. The table object has also two pointer to bitmaps read_set and write_set that the handler should use to find out which columns are used in which way. - count() optimization now calls handler::records() instead of using handler->stats.records (if (table_flags() & HA_HAS_RECORDS) is true). - Added extra argument to Item::walk() to indicate if we should also traverse sub queries. - Added TABLE parameter to cp_buffer_from_ref() - Don't close tables created with CREATE ... SELECT but keep them in the table cache. (Faster usage of newly created tables). New interfaces: - table->clear_column_bitmaps() to initialize the bitmaps for tables at start of new statements. - table->column_bitmaps_set() to set up new column bitmaps and signal the handler about this. - table->column_bitmaps_set_no_signal() for some few cases where we need to setup new column bitmaps but don't signal the handler (as the handler has already been signaled about these before). Used for the momement only in opt_range.cc when doing ROR scans. - table->use_all_columns() to install a bitmap where all columns are marked as use in the read and the write set. - table->default_column_bitmaps() to install the normal read and write column bitmaps, but not signaling the handler about this. This is mainly used when creating TABLE instances. - table->mark_columns_needed_for_delete(), table->mark_columns_needed_for_delete() and table->mark_columns_needed_for_insert() to allow us to put additional columns in column usage maps if handler so requires. (The handler indicates what it neads in handler->table_flags()) - table->prepare_for_position() to allow us to tell handler that it needs to read primary key parts to be able to store them in future table->position() calls. (This replaces the table->file->ha_retrieve_all_pk function) - table->mark_auto_increment_column() to tell handler are going to update columns part of any auto_increment key. - table->mark_columns_used_by_index() to mark all columns that is part of an index. It will also send extra(HA_EXTRA_KEYREAD) to handler to allow it to quickly know that it only needs to read colums that are part of the key. (The handler can also use the column map for detecting this, but simpler/faster handler can just monitor the extra() call). - table->mark_columns_used_by_index_no_reset() to in addition to other columns, also mark all columns that is used by the given key. - table->restore_column_maps_after_mark_index() to restore to default column maps after a call to table->mark_columns_used_by_index(). - New item function register_field_in_read_map(), for marking used columns in table->read_map. Used by filesort() to mark all used columns - Maintain in TABLE->merge_keys set of all keys that are used in query. (Simplices some optimization loops) - Maintain Field->part_of_key_not_clustered which is like Field->part_of_key but the field in the clustered key is not assumed to be part of all index. (used in opt_range.cc for faster loops) - dbug_tmp_use_all_columns(), dbug_tmp_restore_column_map() tmp_use_all_columns() and tmp_restore_column_map() functions to temporally mark all columns as usable. The 'dbug_' version is primarily intended inside a handler when it wants to just call Field:store() & Field::val() functions, but don't need the column maps set for any other usage. (ie:: bitmap_is_set() is never called) - We can't use compare_records() to skip updates for handlers that returns a partial column set and the read_set doesn't cover all columns in the write set. The reason for this is that if we have a column marked only for write we can't in the MySQL level know if the value changed or not. The reason this worked before was that MySQL marked all to be written columns as also to be read. The new 'optimal' bitmaps exposed this 'hidden bug'. - open_table_from_share() does not anymore setup temporary MEM_ROOT object as a thread specific variable for the handler. Instead we send the to-be-used MEMROOT to get_new_handler(). (Simpler, faster code) Bugs fixed: - Column marking was not done correctly in a lot of cases. (ALTER TABLE, when using triggers, auto_increment fields etc) (Could potentially result in wrong values inserted in table handlers relying on that the old column maps or field->set_query_id was correct) Especially when it comes to triggers, there may be cases where the old code would cause lost/wrong values for NDB and/or InnoDB tables. - Split thd->options flag OPTION_STATUS_NO_TRANS_UPDATE to two flags: OPTION_STATUS_NO_TRANS_UPDATE and OPTION_KEEP_LOG. This allowed me to remove some wrong warnings about: "Some non-transactional changed tables couldn't be rolled back" - Fixed handling of INSERT .. SELECT and CREATE ... SELECT that wrongly reset (thd->options & OPTION_STATUS_NO_TRANS_UPDATE) which caused us to loose some warnings about "Some non-transactional changed tables couldn't be rolled back") - Fixed use of uninitialized memory in ha_ndbcluster.cc::delete_table() which could cause delete_table to report random failures. - Fixed core dumps for some tests when running with --debug - Added missing FN_LIBCHAR in mysql_rm_tmp_tables() (This has probably caused us to not properly remove temporary files after crash) - slow_logs was not properly initialized, which could maybe cause extra/lost entries in slow log. - If we get an duplicate row on insert, change column map to read and write all columns while retrying the operation. This is required by the definition of REPLACE and also ensures that fields that are only part of UPDATE are properly handled. This fixed a bug in NDB and REPLACE where REPLACE wrongly copied some column values from the replaced row. - For table handler that doesn't support NULL in keys, we would give an error when creating a primary key with NULL fields, even after the fields has been automaticly converted to NOT NULL. - Creating a primary key on a SPATIAL key, would fail if field was not declared as NOT NULL. Cleanups: - Removed not used condition argument to setup_tables - Removed not needed item function reset_query_id_processor(). - Field->add_index is removed. Now this is instead maintained in (field->flags & FIELD_IN_ADD_INDEX) - Field->fieldnr is removed (use field->field_index instead) - New argument to filesort() to indicate that it should return a set of row pointers (not used columns). This allowed me to remove some references to sql_command in filesort and should also enable us to return column results in some cases where we couldn't before. - Changed column bitmap handling in opt_range.cc to be aligned with TABLE bitmap, which allowed me to use bitmap functions instead of looping over all fields to create some needed bitmaps. (Faster and smaller code) - Broke up found too long lines - Moved some variable declaration at start of function for better code readability. - Removed some not used arguments from functions. (setup_fields(), mysql_prepare_insert_check_table()) - setup_fields() now takes an enum instead of an int for marking columns usage. - For internal temporary tables, use handler::write_row(), handler::delete_row() and handler::update_row() instead of handler::ha_xxxx() for faster execution. - Changed some constants to enum's and define's. - Using separate column read and write sets allows for easier checking of timestamp field was set by statement. - Remove calls to free_io_cache() as this is now done automaticly in ha_reset() - Don't build table->normalized_path as this is now identical to table->path (after bar's fixes to convert filenames) - Fixed some missed DBUG_PRINT(.."%lx") to use "0x%lx" to make it easier to do comparision with the 'convert-dbug-for-diff' tool. Things left to do in 5.1: - We wrongly log failed CREATE TABLE ... SELECT in some cases when using row based logging (as shown by testcase binlog_row_mix_innodb_myisam.result) Mats has promised to look into this. - Test that my fix for CREATE TABLE ... SELECT is indeed correct. (I added several test cases for this, but in this case it's better that someone else also tests this throughly). Lars has promosed to do this.
20 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
26 years ago
26 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#39934: Slave stops for engine that only support row-based logging General overview: The logic for switching to row format when binlog_format=MIXED had numerous flaws. The underlying problem was the lack of a consistent architecture. General purpose of this changeset: This changeset introduces an architecture for switching to row format when binlog_format=MIXED. It enforces the architecture where it has to. It leaves some bugs to be fixed later. It adds extensive tests to verify that unsafe statements work as expected and that appropriate errors are produced by problems with the selection of binlog format. It was not practical to split this into smaller pieces of work. Problem 1: To determine the logging mode, the code has to take several parameters into account (namely: (1) the value of binlog_format; (2) the capabilities of the engines; (3) the type of the current statement: normal, unsafe, or row injection). These parameters may conflict in several ways, namely: - binlog_format=STATEMENT for a row injection - binlog_format=STATEMENT for an unsafe statement - binlog_format=STATEMENT for an engine only supporting row logging - binlog_format=ROW for an engine only supporting statement logging - statement is unsafe and engine does not support row logging - row injection in a table that does not support statement logging - statement modifies one table that does not support row logging and one that does not support statement logging Several of these conflicts were not detected, or were detected with an inappropriate error message. The problem of BUG#39934 was that no appropriate error message was written for the case when an engine only supporting row logging executed a row injection with binlog_format=ROW. However, all above cases must be handled. Fix 1: Introduce new error codes (sql/share/errmsg.txt). Ensure that all conditions are detected and handled in decide_logging_format() Problem 2: The binlog format shall be determined once per statement, in decide_logging_format(). It shall not be changed before or after that. Before decide_logging_format() is called, all information necessary to determine the logging format must be available. This principle ensures that all unsafe statements are handled in a consistent way. However, this principle is not followed: thd->set_current_stmt_binlog_row_based_if_mixed() is called in several places, including from code executing UPDATE..LIMIT, INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and SET @@binlog_format. After Problem 1 was fixed, that caused inconsistencies where these unsafe statements would not print the appropriate warnings or errors for some of the conflicts. Fix 2: Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from code executed after decide_logging_format(). Compensate by calling the set_current_stmt_unsafe() at parse time. This way, all unsafe statements are detected by decide_logging_format(). Problem 3: INSERT DELAYED is not unsafe: it is logged in statement format even if binlog_format=MIXED, and no warning is printed even if binlog_format=STATEMENT. This is BUG#45825. Fix 3: Made INSERT DELAYED set itself to unsafe at parse time. This allows decide_logging_format() to detect that a warning should be printed or the binlog_format changed. Problem 4: LIMIT clause were not marked as unsafe when executed inside stored functions/triggers/views/prepared statements. This is BUG#45785. Fix 4: Make statements containing the LIMIT clause marked as unsafe at parse time, instead of at execution time. This allows propagating unsafe-ness to the view.
17 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
26 years ago
26 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
26 years ago
Fix for: Bug #20662 "Infinite loop in CREATE TABLE IF NOT EXISTS ... SELECT with locked tables" Bug #20903 "Crash when using CREATE TABLE .. SELECT and triggers" Bug #24738 "CREATE TABLE ... SELECT is not isolated properly" Bug #24508 "Inconsistent results of CREATE TABLE ... SELECT when temporary table exists" Deadlock occured when one tried to execute CREATE TABLE IF NOT EXISTS ... SELECT statement under LOCK TABLES which held read lock on target table. Attempt to execute the same statement for already existing target table with triggers caused server crashes. Also concurrent execution of CREATE TABLE ... SELECT statement and other statements involving target table suffered from various races (some of which might've led to deadlocks). Finally, attempt to execute CREATE TABLE ... SELECT in case when a temporary table with same name was already present led to the insertion of data into this temporary table and creation of empty non-temporary table. All above problems stemmed from the old implementation of CREATE TABLE ... SELECT in which we created, opened and locked target table without any special protection in a separate step and not with the rest of tables used by this statement. This underminded deadlock-avoidance approach used in server and created window for races. It also excluded target table from prelocking causing problems with trigger execution. The patch solves these problems by implementing new approach to handling of CREATE TABLE ... SELECT for base tables. We try to open and lock table to be created at the same time as the rest of tables used by this statement. If such table does not exist at this moment we create and place in the table cache special placeholder for it which prevents its creation or any other usage by other threads. We still use old approach for creation of temporary tables. Note that we have separate fix for 5.0 since there we use slightly different less intrusive approach.
19 years ago
26 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
merge mysql-5.1-rep+2-delivery1 --> mysql-5.1-rpl-merge Conflicts: Text conflict in .bzr-mysql/default.conf Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test Text conflict in mysql-test/r/mysqlbinlog2.result Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result Text conflict in mysql-test/suite/rpl/r/rpl_udf.result Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test Text conflict in sql/field.h Text conflict in sql/log.cc Text conflict in sql/log_event.cc Text conflict in sql/log_event_old.cc Text conflict in sql/mysql_priv.h Text conflict in sql/share/errmsg.txt Text conflict in sql/sp.cc Text conflict in sql/sql_acl.cc Text conflict in sql/sql_base.cc Text conflict in sql/sql_class.h Text conflict in sql/sql_db.cc Text conflict in sql/sql_delete.cc Text conflict in sql/sql_insert.cc Text conflict in sql/sql_lex.cc Text conflict in sql/sql_lex.h Text conflict in sql/sql_load.cc Text conflict in sql/sql_table.cc Text conflict in sql/sql_update.cc Text conflict in sql/sql_view.cc Conflict adding files to storage/innobase. Created directory. Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory. Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved. Conflict adding files to storage/innobase/handler. Created directory. Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory. Contents conflict in storage/innobase/handler/ha_innodb.cc
16 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
26 years ago
26 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
26 years ago
Fix for bug#18437 "Wrong values inserted with a before update trigger on NDB table". SQL-layer was not marking fields which were used in triggers as such. As result these fields were not always properly retrieved/stored by handler layer. So one might got wrong values or lost changes in triggers for NDB, Federated and possibly InnoDB tables. This fix solves the problem by marking fields used in triggers appropriately. Also this patch contains the following cleanup of ha_ndbcluster code: We no longer rely on reading LEX::sql_command value in handler in order to determine if we can enable optimization which allows us to handle REPLACE statement in more efficient way by doing replaces directly in write_row() method without reporting error to SQL-layer. Instead we rely on SQL-layer informing us whether this optimization applicable by calling handler::extra() method with HA_EXTRA_WRITE_CAN_REPLACE flag. As result we no longer apply this optimzation in cases when it should not be used (e.g. if we have on delete triggers on table) and use in some additional cases when it is applicable (e.g. for LOAD DATA REPLACE). Finally this patch includes fix for bug#20728 "REPLACE does not work correctly for NDB table with PK and unique index". This was yet another problem which was caused by improper field mark-up. During row replacement fields which weren't explicity used in REPLACE statement were not marked as fields to be saved (updated) so they have retained values from old row version. The fix is to mark all table fields as set for REPLACE statement. Note that in 5.1 we already solve this problem by notifying handler that it should save values from all fields only in case when real replacement happens.
20 years ago
26 years ago
BUG#22864 (Rollback following CREATE... SELECT discards 'CREATE TABLE' from log): When row-based logging is used, the CREATE-SELECT is written as two parts: as a CREATE TABLE statement and as the rows for the table. For both transactional and non-transactional tables, the CREATE TABLE statement was written to the transaction cache, as were the rows, and on statement end, the entire transaction cache was written to the binary log if the table was non-transactional. For transactional tables, the events were kept in the transaction cache until end of transaction (or statement that were not part of a transaction). For the case when AUTOCOMMIT=0 and we are creating a transactional table using a create select, we would then keep the CREATE TABLE statement and the rows for the CREATE-SELECT, while executing the following statements. On a rollback, the transaction cache would then be cleared, which would also remove the CREATE TABLE statement. Hence no table would be created on the slave, while there is an empty table on the master. This relates to BUG#22865 where the table being created exists on the master, but not on the slave during insertion of rows into the newly created table. This occurs since the CREATE TABLE statement were still in the transaction cache until the statement finished executing, and possibly longer if the table was transactional. This patch changes the behaviour of the CREATE-SELECT statement by adding an implicit commit at the end of the statement when creating non-temporary tables. Hence, non-temporary tables will be written to the binary log on completion, and in the even of AUTOCOMMIT=0, a new transaction will be started. Temporary tables do not commit an ongoing transaction: neither as a pre- not a post-commit. The events for both transactional and non-transactional tables are saved in the transaction cache, and written to the binary log at end of the statement.
19 years ago
26 years ago
26 years ago
26 years ago
  1. /* Copyright 2000-2008 MySQL AB, 2008-2009 Sun Microsystems, Inc.
  2. This program is free software; you can redistribute it and/or modify
  3. it under the terms of the GNU General Public License as published by
  4. the Free Software Foundation; version 2 of the License.
  5. This program is distributed in the hope that it will be useful,
  6. but WITHOUT ANY WARRANTY; without even the implied warranty of
  7. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  8. GNU General Public License for more details.
  9. You should have received a copy of the GNU General Public License
  10. along with this program; if not, write to the Free Software
  11. Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
  12. /* Insert of records */
  13. /*
  14. INSERT DELAYED
  15. Insert delayed is distinguished from a normal insert by lock_type ==
  16. TL_WRITE_DELAYED instead of TL_WRITE. It first tries to open a
  17. "delayed" table (delayed_get_table()), but falls back to
  18. open_and_lock_tables() on error and proceeds as normal insert then.
  19. Opening a "delayed" table means to find a delayed insert thread that
  20. has the table open already. If this fails, a new thread is created and
  21. waited for to open and lock the table.
  22. If accessing the thread succeeded, in
  23. Delayed_insert::get_local_table() the table of the thread is copied
  24. for local use. A copy is required because the normal insert logic
  25. works on a target table, but the other threads table object must not
  26. be used. The insert logic uses the record buffer to create a record.
  27. And the delayed insert thread uses the record buffer to pass the
  28. record to the table handler. So there must be different objects. Also
  29. the copied table is not included in the lock, so that the statement
  30. can proceed even if the real table cannot be accessed at this moment.
  31. Copying a table object is not a trivial operation. Besides the TABLE
  32. object there are the field pointer array, the field objects and the
  33. record buffer. After copying the field objects, their pointers into
  34. the record must be "moved" to point to the new record buffer.
  35. After this setup the normal insert logic is used. Only that for
  36. delayed inserts write_delayed() is called instead of write_record().
  37. It inserts the rows into a queue and signals the delayed insert thread
  38. instead of writing directly to the table.
  39. The delayed insert thread awakes from the signal. It locks the table,
  40. inserts the rows from the queue, unlocks the table, and waits for the
  41. next signal. It does normally live until a FLUSH TABLES or SHUTDOWN.
  42. */
  43. #include "my_global.h" /* NO_EMBEDDED_ACCESS_CHECKS */
  44. #include "sql_priv.h"
  45. #include "unireg.h" // REQUIRED: for other includes
  46. #include "sql_insert.h"
  47. #include "sql_update.h" // compare_record
  48. #include "sql_base.h" // close_thread_tables
  49. #include "sql_cache.h" // query_cache_*
  50. #include "key.h" // key_copy
  51. #include "lock.h" // mysql_unlock_tables
  52. #include "sp_head.h"
  53. #include "sql_view.h" // check_key_in_view, insert_view_fields
  54. #include "sql_table.h" // mysql_create_table_no_lock
  55. #include "sql_acl.h" // *_ACL, check_grant_all_columns
  56. #include "sql_trigger.h"
  57. #include "sql_select.h"
  58. #include "sql_show.h"
  59. #include "slave.h"
  60. #include "sql_parse.h" // end_active_trans
  61. #include "rpl_mi.h"
  62. #include "transaction.h"
  63. #include "sql_audit.h"
  64. #ifndef EMBEDDED_LIBRARY
  65. static bool delayed_get_table(THD *thd, TABLE_LIST *table_list);
  66. static int write_delayed(THD *thd, TABLE *table, enum_duplicates duplic,
  67. LEX_STRING query, bool ignore, bool log_on);
  68. static void end_delayed_insert(THD *thd);
  69. pthread_handler_t handle_delayed_insert(void *arg);
  70. static void unlink_blobs(register TABLE *table);
  71. #endif
  72. static bool check_view_insertability(THD *thd, TABLE_LIST *view);
  73. /* Define to force use of my_malloc() if the allocated memory block is big */
  74. #ifndef HAVE_ALLOCA
  75. #define my_safe_alloca(size, min_length) my_alloca(size)
  76. #define my_safe_afree(ptr, size, min_length) my_afree(ptr)
  77. #else
  78. #define my_safe_alloca(size, min_length) ((size <= min_length) ? my_alloca(size) : my_malloc(size,MYF(0)))
  79. #define my_safe_afree(ptr, size, min_length) if (size > min_length) my_free(ptr)
  80. #endif
  81. /*
  82. Check that insert/update fields are from the same single table of a view.
  83. SYNOPSIS
  84. check_view_single_update()
  85. fields The insert/update fields to be checked.
  86. view The view for insert.
  87. map [in/out] The insert table map.
  88. DESCRIPTION
  89. This function is called in 2 cases:
  90. 1. to check insert fields. In this case *map will be set to 0.
  91. Insert fields are checked to be all from the same single underlying
  92. table of the given view. Otherwise the error is thrown. Found table
  93. map is returned in the map parameter.
  94. 2. to check update fields of the ON DUPLICATE KEY UPDATE clause.
  95. In this case *map contains table_map found on the previous call of
  96. the function to check insert fields. Update fields are checked to be
  97. from the same table as the insert fields.
  98. RETURN
  99. 0 OK
  100. 1 Error
  101. */
  102. bool check_view_single_update(List<Item> &fields, List<Item> *values,
  103. TABLE_LIST *view, table_map *map)
  104. {
  105. /* it is join view => we need to find the table for update */
  106. List_iterator_fast<Item> it(fields);
  107. Item *item;
  108. TABLE_LIST *tbl= 0; // reset for call to check_single_table()
  109. table_map tables= 0;
  110. while ((item= it++))
  111. tables|= item->used_tables();
  112. if (values)
  113. {
  114. it.init(*values);
  115. while ((item= it++))
  116. tables|= item->used_tables();
  117. }
  118. /* Convert to real table bits */
  119. tables&= ~PSEUDO_TABLE_BITS;
  120. /* Check found map against provided map */
  121. if (*map)
  122. {
  123. if (tables != *map)
  124. goto error;
  125. return FALSE;
  126. }
  127. if (view->check_single_table(&tbl, tables, view) || tbl == 0)
  128. goto error;
  129. view->table= tbl->table;
  130. *map= tables;
  131. return FALSE;
  132. error:
  133. my_error(ER_VIEW_MULTIUPDATE, MYF(0),
  134. view->view_db.str, view->view_name.str);
  135. return TRUE;
  136. }
  137. /*
  138. Check if insert fields are correct.
  139. SYNOPSIS
  140. check_insert_fields()
  141. thd The current thread.
  142. table The table for insert.
  143. fields The insert fields.
  144. values The insert values.
  145. check_unique If duplicate values should be rejected.
  146. NOTE
  147. Clears TIMESTAMP_AUTO_SET_ON_INSERT from table->timestamp_field_type
  148. or leaves it as is, depending on if timestamp should be updated or
  149. not.
  150. RETURN
  151. 0 OK
  152. -1 Error
  153. */
  154. static int check_insert_fields(THD *thd, TABLE_LIST *table_list,
  155. List<Item> &fields, List<Item> &values,
  156. bool check_unique,
  157. bool fields_and_values_from_different_maps,
  158. table_map *map)
  159. {
  160. TABLE *table= table_list->table;
  161. if (!table_list->updatable)
  162. {
  163. my_error(ER_NON_INSERTABLE_TABLE, MYF(0), table_list->alias, "INSERT");
  164. return -1;
  165. }
  166. if (fields.elements == 0 && values.elements != 0)
  167. {
  168. if (!table)
  169. {
  170. my_error(ER_VIEW_NO_INSERT_FIELD_LIST, MYF(0),
  171. table_list->view_db.str, table_list->view_name.str);
  172. return -1;
  173. }
  174. if (values.elements != table->s->fields)
  175. {
  176. my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1L);
  177. return -1;
  178. }
  179. #ifndef NO_EMBEDDED_ACCESS_CHECKS
  180. Field_iterator_table_ref field_it;
  181. field_it.set(table_list);
  182. if (check_grant_all_columns(thd, INSERT_ACL, &field_it))
  183. return -1;
  184. #endif
  185. clear_timestamp_auto_bits(table->timestamp_field_type,
  186. TIMESTAMP_AUTO_SET_ON_INSERT);
  187. /*
  188. No fields are provided so all fields must be provided in the values.
  189. Thus we set all bits in the write set.
  190. */
  191. bitmap_set_all(table->write_set);
  192. }
  193. else
  194. { // Part field list
  195. SELECT_LEX *select_lex= &thd->lex->select_lex;
  196. Name_resolution_context *context= &select_lex->context;
  197. Name_resolution_context_state ctx_state;
  198. int res;
  199. if (fields.elements != values.elements)
  200. {
  201. my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1L);
  202. return -1;
  203. }
  204. thd->dup_field= 0;
  205. select_lex->no_wrap_view_item= TRUE;
  206. /* Save the state of the current name resolution context. */
  207. ctx_state.save_state(context, table_list);
  208. /*
  209. Perform name resolution only in the first table - 'table_list',
  210. which is the table that is inserted into.
  211. */
  212. table_list->next_local= 0;
  213. context->resolve_in_table_list_only(table_list);
  214. res= setup_fields(thd, 0, fields, MARK_COLUMNS_WRITE, 0, 0);
  215. /* Restore the current context. */
  216. ctx_state.restore_state(context, table_list);
  217. thd->lex->select_lex.no_wrap_view_item= FALSE;
  218. if (res)
  219. return -1;
  220. if (table_list->effective_algorithm == VIEW_ALGORITHM_MERGE)
  221. {
  222. if (check_view_single_update(fields,
  223. fields_and_values_from_different_maps ?
  224. (List<Item>*) 0 : &values,
  225. table_list, map))
  226. return -1;
  227. table= table_list->table;
  228. }
  229. if (check_unique && thd->dup_field)
  230. {
  231. my_error(ER_FIELD_SPECIFIED_TWICE, MYF(0), thd->dup_field->field_name);
  232. return -1;
  233. }
  234. if (table->timestamp_field) // Don't automaticly set timestamp if used
  235. {
  236. if (bitmap_is_set(table->write_set,
  237. table->timestamp_field->field_index))
  238. clear_timestamp_auto_bits(table->timestamp_field_type,
  239. TIMESTAMP_AUTO_SET_ON_INSERT);
  240. else
  241. {
  242. bitmap_set_bit(table->write_set,
  243. table->timestamp_field->field_index);
  244. }
  245. }
  246. }
  247. // For the values we need select_priv
  248. #ifndef NO_EMBEDDED_ACCESS_CHECKS
  249. table->grant.want_privilege= (SELECT_ACL & ~table->grant.privilege);
  250. #endif
  251. if (check_key_in_view(thd, table_list) ||
  252. (table_list->view &&
  253. check_view_insertability(thd, table_list)))
  254. {
  255. my_error(ER_NON_INSERTABLE_TABLE, MYF(0), table_list->alias, "INSERT");
  256. return -1;
  257. }
  258. return 0;
  259. }
  260. /*
  261. Check update fields for the timestamp field.
  262. SYNOPSIS
  263. check_update_fields()
  264. thd The current thread.
  265. insert_table_list The insert table list.
  266. table The table for update.
  267. update_fields The update fields.
  268. NOTE
  269. If the update fields include the timestamp field,
  270. remove TIMESTAMP_AUTO_SET_ON_UPDATE from table->timestamp_field_type.
  271. RETURN
  272. 0 OK
  273. -1 Error
  274. */
  275. static int check_update_fields(THD *thd, TABLE_LIST *insert_table_list,
  276. List<Item> &update_fields,
  277. List<Item> &update_values, table_map *map)
  278. {
  279. TABLE *table= insert_table_list->table;
  280. my_bool timestamp_mark= 0;
  281. if (table->timestamp_field)
  282. {
  283. /*
  284. Unmark the timestamp field so that we can check if this is modified
  285. by update_fields
  286. */
  287. timestamp_mark= bitmap_test_and_clear(table->write_set,
  288. table->timestamp_field->field_index);
  289. }
  290. /* Check the fields we are going to modify */
  291. if (setup_fields(thd, 0, update_fields, MARK_COLUMNS_WRITE, 0, 0))
  292. return -1;
  293. if (insert_table_list->effective_algorithm == VIEW_ALGORITHM_MERGE &&
  294. check_view_single_update(update_fields, &update_values,
  295. insert_table_list, map))
  296. return -1;
  297. if (table->timestamp_field)
  298. {
  299. /* Don't set timestamp column if this is modified. */
  300. if (bitmap_is_set(table->write_set,
  301. table->timestamp_field->field_index))
  302. clear_timestamp_auto_bits(table->timestamp_field_type,
  303. TIMESTAMP_AUTO_SET_ON_UPDATE);
  304. if (timestamp_mark)
  305. bitmap_set_bit(table->write_set,
  306. table->timestamp_field->field_index);
  307. }
  308. return 0;
  309. }
  310. /*
  311. Prepare triggers for INSERT-like statement.
  312. SYNOPSIS
  313. prepare_triggers_for_insert_stmt()
  314. table Table to which insert will happen
  315. NOTE
  316. Prepare triggers for INSERT-like statement by marking fields
  317. used by triggers and inform handlers that batching of UPDATE/DELETE
  318. cannot be done if there are BEFORE UPDATE/DELETE triggers.
  319. */
  320. void prepare_triggers_for_insert_stmt(TABLE *table)
  321. {
  322. if (table->triggers)
  323. {
  324. if (table->triggers->has_triggers(TRG_EVENT_DELETE,
  325. TRG_ACTION_AFTER))
  326. {
  327. /*
  328. The table has AFTER DELETE triggers that might access to
  329. subject table and therefore might need delete to be done
  330. immediately. So we turn-off the batching.
  331. */
  332. (void) table->file->extra(HA_EXTRA_DELETE_CANNOT_BATCH);
  333. }
  334. if (table->triggers->has_triggers(TRG_EVENT_UPDATE,
  335. TRG_ACTION_AFTER))
  336. {
  337. /*
  338. The table has AFTER UPDATE triggers that might access to subject
  339. table and therefore might need update to be done immediately.
  340. So we turn-off the batching.
  341. */
  342. (void) table->file->extra(HA_EXTRA_UPDATE_CANNOT_BATCH);
  343. }
  344. }
  345. table->mark_columns_needed_for_insert();
  346. }
  347. /**
  348. Upgrade table-level lock of INSERT statement to TL_WRITE if
  349. a more concurrent lock is infeasible for some reason. This is
  350. necessary for engines without internal locking support (MyISAM).
  351. An engine with internal locking implementation might later
  352. downgrade the lock in handler::store_lock() method.
  353. */
  354. static
  355. void upgrade_lock_type(THD *thd, thr_lock_type *lock_type,
  356. enum_duplicates duplic,
  357. bool is_multi_insert)
  358. {
  359. if (duplic == DUP_UPDATE ||
  360. (duplic == DUP_REPLACE && *lock_type == TL_WRITE_CONCURRENT_INSERT))
  361. {
  362. *lock_type= TL_WRITE_DEFAULT;
  363. return;
  364. }
  365. if (*lock_type == TL_WRITE_DELAYED)
  366. {
  367. /*
  368. We do not use delayed threads if:
  369. - we're running in the safe mode or skip-new mode -- the
  370. feature is disabled in these modes
  371. - we're executing this statement on a replication slave --
  372. we need to ensure serial execution of queries on the
  373. slave
  374. - it is INSERT .. ON DUPLICATE KEY UPDATE - in this case the
  375. insert cannot be concurrent
  376. - this statement is directly or indirectly invoked from
  377. a stored function or trigger (under pre-locking) - to
  378. avoid deadlocks, since INSERT DELAYED involves a lock
  379. upgrade (TL_WRITE_DELAYED -> TL_WRITE) which we should not
  380. attempt while keeping other table level locks.
  381. - this statement itself may require pre-locking.
  382. We should upgrade the lock even though in most cases
  383. delayed functionality may work. Unfortunately, we can't
  384. easily identify whether the subject table is not used in
  385. the statement indirectly via a stored function or trigger:
  386. if it is used, that will lead to a deadlock between the
  387. client connection and the delayed thread.
  388. */
  389. if (specialflag & (SPECIAL_NO_NEW_FUNC | SPECIAL_SAFE_MODE) ||
  390. thd->variables.max_insert_delayed_threads == 0 ||
  391. thd->locked_tables_mode > LTM_LOCK_TABLES ||
  392. thd->lex->uses_stored_routines())
  393. {
  394. *lock_type= TL_WRITE;
  395. return;
  396. }
  397. if (thd->slave_thread)
  398. {
  399. /* Try concurrent insert */
  400. *lock_type= (duplic == DUP_UPDATE || duplic == DUP_REPLACE) ?
  401. TL_WRITE : TL_WRITE_CONCURRENT_INSERT;
  402. return;
  403. }
  404. bool log_on= (thd->variables.option_bits & OPTION_BIN_LOG ||
  405. ! (thd->security_ctx->master_access & SUPER_ACL));
  406. if (global_system_variables.binlog_format == BINLOG_FORMAT_STMT &&
  407. log_on && mysql_bin_log.is_open() && is_multi_insert)
  408. {
  409. /*
  410. Statement-based binary logging does not work in this case, because:
  411. a) two concurrent statements may have their rows intermixed in the
  412. queue, leading to autoincrement replication problems on slave (because
  413. the values generated used for one statement don't depend only on the
  414. value generated for the first row of this statement, so are not
  415. replicable)
  416. b) if first row of the statement has an error the full statement is
  417. not binlogged, while next rows of the statement may be inserted.
  418. c) if first row succeeds, statement is binlogged immediately with a
  419. zero error code (i.e. "no error"), if then second row fails, query
  420. will fail on slave too and slave will stop (wrongly believing that the
  421. master got no error).
  422. So we fallback to non-delayed INSERT.
  423. Note that to be fully correct, we should test the "binlog format which
  424. the delayed thread is going to use for this row". But in the common case
  425. where the global binlog format is not changed and the session binlog
  426. format may be changed, that is equal to the global binlog format.
  427. We test it without mutex for speed reasons (condition rarely true), and
  428. in the common case (global not changed) it is as good as without mutex;
  429. if global value is changed, anyway there is uncertainty as the delayed
  430. thread may be old and use the before-the-change value.
  431. */
  432. *lock_type= TL_WRITE;
  433. }
  434. }
  435. }
  436. /**
  437. Find or create a delayed insert thread for the first table in
  438. the table list, then open and lock the remaining tables.
  439. If a table can not be used with insert delayed, upgrade the lock
  440. and open and lock all tables using the standard mechanism.
  441. @param thd thread context
  442. @param table_list list of "descriptors" for tables referenced
  443. directly in statement SQL text.
  444. The first element in the list corresponds to
  445. the destination table for inserts, remaining
  446. tables, if any, are usually tables referenced
  447. by sub-queries in the right part of the
  448. INSERT.
  449. @return Status of the operation. In case of success 'table'
  450. member of every table_list element points to an instance of
  451. class TABLE.
  452. @sa open_and_lock_tables for more information about MySQL table
  453. level locking
  454. */
  455. static
  456. bool open_and_lock_for_insert_delayed(THD *thd, TABLE_LIST *table_list)
  457. {
  458. DBUG_ENTER("open_and_lock_for_insert_delayed");
  459. #ifndef EMBEDDED_LIBRARY
  460. if (thd->locked_tables_mode && thd->global_read_lock.is_acquired())
  461. {
  462. /*
  463. If this connection has the global read lock, the handler thread
  464. will not be able to lock the table. It will wait for the global
  465. read lock to go away, but this will never happen since the
  466. connection thread will be stuck waiting for the handler thread
  467. to open and lock the table.
  468. If we are not in locked tables mode, INSERT will seek protection
  469. against the global read lock (and fail), thus we will only get
  470. to this point in locked tables mode.
  471. */
  472. my_error(ER_CANT_UPDATE_WITH_READLOCK, MYF(0));
  473. DBUG_RETURN(TRUE);
  474. }
  475. /*
  476. In order for the deadlock detector to be able to find any deadlocks
  477. caused by the handler thread locking this table, we take the metadata
  478. lock inside the connection thread. If this goes ok, the ticket is cloned
  479. and added to the list of granted locks held by the handler thread.
  480. */
  481. MDL_ticket *mdl_savepoint= thd->mdl_context.mdl_savepoint();
  482. if (thd->mdl_context.acquire_lock(&table_list->mdl_request,
  483. thd->variables.lock_wait_timeout))
  484. /*
  485. If a lock can't be acquired, it makes no sense to try normal insert.
  486. Therefore we just abort the statement.
  487. */
  488. DBUG_RETURN(TRUE);
  489. bool error= FALSE;
  490. if (delayed_get_table(thd, table_list))
  491. error= TRUE;
  492. else if (table_list->table)
  493. {
  494. /*
  495. Open tables used for sub-selects or in stored functions, will also
  496. cache these functions.
  497. */
  498. if (open_and_lock_tables(thd, table_list->next_global, TRUE, 0))
  499. {
  500. end_delayed_insert(thd);
  501. error= TRUE;
  502. }
  503. else
  504. {
  505. /*
  506. First table was not processed by open_and_lock_tables(),
  507. we need to set updatability flag "by hand".
  508. */
  509. if (!table_list->derived && !table_list->view)
  510. table_list->updatable= 1; // usual table
  511. }
  512. }
  513. /*
  514. If a lock was acquired above, we should release it after
  515. handle_delayed_insert() has cloned the ticket. Note that acquire_lock() can
  516. succeed because the connection already has the lock. In this case the ticket
  517. will be before the mdl_savepoint and we should not release it here.
  518. */
  519. if (!thd->mdl_context.has_lock(mdl_savepoint, table_list->mdl_request.ticket))
  520. thd->mdl_context.release_lock(table_list->mdl_request.ticket);
  521. /*
  522. Reset the ticket in case we end up having to use normal insert and
  523. therefore will reopen the table and reacquire the metadata lock.
  524. */
  525. table_list->mdl_request.ticket= NULL;
  526. if (error || table_list->table)
  527. DBUG_RETURN(error);
  528. #endif
  529. /*
  530. * This is embedded library and we don't have auxiliary
  531. threads OR
  532. * a lock upgrade was requested inside delayed_get_table
  533. because
  534. - there are too many delayed insert threads OR
  535. - the table has triggers.
  536. Use a normal insert.
  537. */
  538. table_list->lock_type= TL_WRITE;
  539. DBUG_RETURN(open_and_lock_tables(thd, table_list, TRUE, 0));
  540. }
  541. /**
  542. Create a new query string for removing DELAYED keyword for
  543. multi INSERT DEALAYED statement.
  544. @param[in] thd Thread handler
  545. @param[in] buf Query string
  546. @return
  547. 0 ok
  548. 1 error
  549. */
  550. static int
  551. create_insert_stmt_from_insert_delayed(THD *thd, String *buf)
  552. {
  553. /* Make a copy of thd->query() and then remove the "DELAYED" keyword */
  554. if (buf->append(thd->query()) ||
  555. buf->replace(thd->lex->keyword_delayed_begin_offset,
  556. thd->lex->keyword_delayed_end_offset -
  557. thd->lex->keyword_delayed_begin_offset, 0))
  558. return 1;
  559. return 0;
  560. }
  561. /**
  562. INSERT statement implementation
  563. @note Like implementations of other DDL/DML in MySQL, this function
  564. relies on the caller to close the thread tables. This is done in the
  565. end of dispatch_command().
  566. */
  567. bool mysql_insert(THD *thd,TABLE_LIST *table_list,
  568. List<Item> &fields,
  569. List<List_item> &values_list,
  570. List<Item> &update_fields,
  571. List<Item> &update_values,
  572. enum_duplicates duplic,
  573. bool ignore)
  574. {
  575. int error, res;
  576. bool transactional_table, joins_freed= FALSE;
  577. bool changed;
  578. bool was_insert_delayed= (table_list->lock_type == TL_WRITE_DELAYED);
  579. uint value_count;
  580. ulong counter = 1;
  581. ulonglong id;
  582. COPY_INFO info;
  583. TABLE *table= 0;
  584. List_iterator_fast<List_item> its(values_list);
  585. List_item *values;
  586. Name_resolution_context *context;
  587. Name_resolution_context_state ctx_state;
  588. #ifndef EMBEDDED_LIBRARY
  589. char *query= thd->query();
  590. /*
  591. log_on is about delayed inserts only.
  592. By default, both logs are enabled (this won't cause problems if the server
  593. runs without --log-bin).
  594. */
  595. bool log_on= ((thd->variables.option_bits & OPTION_BIN_LOG) ||
  596. (!(thd->security_ctx->master_access & SUPER_ACL)));
  597. #endif
  598. thr_lock_type lock_type;
  599. Item *unused_conds= 0;
  600. DBUG_ENTER("mysql_insert");
  601. /*
  602. Upgrade lock type if the requested lock is incompatible with
  603. the current connection mode or table operation.
  604. */
  605. upgrade_lock_type(thd, &table_list->lock_type, duplic,
  606. values_list.elements > 1);
  607. /*
  608. We can't write-delayed into a table locked with LOCK TABLES:
  609. this will lead to a deadlock, since the delayed thread will
  610. never be able to get a lock on the table. QQQ: why not
  611. upgrade the lock here instead?
  612. */
  613. if (table_list->lock_type == TL_WRITE_DELAYED &&
  614. thd->locked_tables_mode &&
  615. find_locked_table(thd->open_tables, table_list->db,
  616. table_list->table_name))
  617. {
  618. my_error(ER_DELAYED_INSERT_TABLE_LOCKED, MYF(0),
  619. table_list->table_name);
  620. DBUG_RETURN(TRUE);
  621. }
  622. if (table_list->lock_type == TL_WRITE_DELAYED)
  623. {
  624. if (open_and_lock_for_insert_delayed(thd, table_list))
  625. DBUG_RETURN(TRUE);
  626. }
  627. else
  628. {
  629. if (open_and_lock_tables(thd, table_list, TRUE, 0))
  630. DBUG_RETURN(TRUE);
  631. }
  632. lock_type= table_list->lock_type;
  633. thd_proc_info(thd, "init");
  634. thd->used_tables=0;
  635. values= its++;
  636. value_count= values->elements;
  637. if (mysql_prepare_insert(thd, table_list, table, fields, values,
  638. update_fields, update_values, duplic, &unused_conds,
  639. FALSE,
  640. (fields.elements || !value_count ||
  641. table_list->view != 0),
  642. !ignore && (thd->variables.sql_mode &
  643. (MODE_STRICT_TRANS_TABLES |
  644. MODE_STRICT_ALL_TABLES))))
  645. goto abort;
  646. /* mysql_prepare_insert set table_list->table if it was not set */
  647. table= table_list->table;
  648. context= &thd->lex->select_lex.context;
  649. /*
  650. These three asserts test the hypothesis that the resetting of the name
  651. resolution context below is not necessary at all since the list of local
  652. tables for INSERT always consists of one table.
  653. */
  654. DBUG_ASSERT(!table_list->next_local);
  655. DBUG_ASSERT(!context->table_list->next_local);
  656. DBUG_ASSERT(!context->first_name_resolution_table->next_name_resolution_table);
  657. /* Save the state of the current name resolution context. */
  658. ctx_state.save_state(context, table_list);
  659. /*
  660. Perform name resolution only in the first table - 'table_list',
  661. which is the table that is inserted into.
  662. */
  663. table_list->next_local= 0;
  664. context->resolve_in_table_list_only(table_list);
  665. while ((values= its++))
  666. {
  667. counter++;
  668. if (values->elements != value_count)
  669. {
  670. my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), counter);
  671. goto abort;
  672. }
  673. if (setup_fields(thd, 0, *values, MARK_COLUMNS_READ, 0, 0))
  674. goto abort;
  675. }
  676. its.rewind ();
  677. /* Restore the current context. */
  678. ctx_state.restore_state(context, table_list);
  679. /*
  680. Fill in the given fields and dump it to the table file
  681. */
  682. bzero((char*) &info,sizeof(info));
  683. info.ignore= ignore;
  684. info.handle_duplicates=duplic;
  685. info.update_fields= &update_fields;
  686. info.update_values= &update_values;
  687. info.view= (table_list->view ? table_list : 0);
  688. /*
  689. Count warnings for all inserts.
  690. For single line insert, generate an error if try to set a NOT NULL field
  691. to NULL.
  692. */
  693. thd->count_cuted_fields= ((values_list.elements == 1 &&
  694. !ignore) ?
  695. CHECK_FIELD_ERROR_FOR_NULL :
  696. CHECK_FIELD_WARN);
  697. thd->cuted_fields = 0L;
  698. table->next_number_field=table->found_next_number_field;
  699. #ifdef HAVE_REPLICATION
  700. if (thd->slave_thread &&
  701. (info.handle_duplicates == DUP_UPDATE) &&
  702. (table->next_number_field != NULL) &&
  703. rpl_master_has_bug(&active_mi->rli, 24432, TRUE, NULL, NULL))
  704. goto abort;
  705. #endif
  706. error=0;
  707. thd_proc_info(thd, "update");
  708. if (duplic == DUP_REPLACE &&
  709. (!table->triggers || !table->triggers->has_delete_triggers()))
  710. table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
  711. if (duplic == DUP_UPDATE)
  712. table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
  713. /*
  714. let's *try* to start bulk inserts. It won't necessary
  715. start them as values_list.elements should be greater than
  716. some - handler dependent - threshold.
  717. We should not start bulk inserts if this statement uses
  718. functions or invokes triggers since they may access
  719. to the same table and therefore should not see its
  720. inconsistent state created by this optimization.
  721. So we call start_bulk_insert to perform nesessary checks on
  722. values_list.elements, and - if nothing else - to initialize
  723. the code to make the call of end_bulk_insert() below safe.
  724. */
  725. #ifndef EMBEDDED_LIBRARY
  726. if (lock_type != TL_WRITE_DELAYED)
  727. #endif /* EMBEDDED_LIBRARY */
  728. {
  729. if (duplic != DUP_ERROR || ignore)
  730. table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  731. /**
  732. This is a simple check for the case when the table has a trigger
  733. that reads from it, or when the statement invokes a stored function
  734. that reads from the table being inserted to.
  735. Engines can't handle a bulk insert in parallel with a read form the
  736. same table in the same connection.
  737. */
  738. if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
  739. table->file->ha_start_bulk_insert(values_list.elements);
  740. }
  741. thd->abort_on_warning= (!ignore && (thd->variables.sql_mode &
  742. (MODE_STRICT_TRANS_TABLES |
  743. MODE_STRICT_ALL_TABLES)));
  744. prepare_triggers_for_insert_stmt(table);
  745. if (table_list->prepare_where(thd, 0, TRUE) ||
  746. table_list->prepare_check_option(thd))
  747. error= 1;
  748. while ((values= its++))
  749. {
  750. if (fields.elements || !value_count)
  751. {
  752. restore_record(table,s->default_values); // Get empty record
  753. if (fill_record_n_invoke_before_triggers(thd, fields, *values, 0,
  754. table->triggers,
  755. TRG_EVENT_INSERT))
  756. {
  757. if (values_list.elements != 1 && ! thd->is_error())
  758. {
  759. info.records++;
  760. continue;
  761. }
  762. /*
  763. TODO: set thd->abort_on_warning if values_list.elements == 1
  764. and check that all items return warning in case of problem with
  765. storing field.
  766. */
  767. error=1;
  768. break;
  769. }
  770. }
  771. else
  772. {
  773. if (thd->used_tables) // Column used in values()
  774. restore_record(table,s->default_values); // Get empty record
  775. else
  776. {
  777. TABLE_SHARE *share= table->s;
  778. /*
  779. Fix delete marker. No need to restore rest of record since it will
  780. be overwritten by fill_record() anyway (and fill_record() does not
  781. use default values in this case).
  782. */
  783. table->record[0][0]= share->default_values[0];
  784. /* Fix undefined null_bits. */
  785. if (share->null_bytes > 1 && share->last_null_bit_pos)
  786. {
  787. table->record[0][share->null_bytes - 1]=
  788. share->default_values[share->null_bytes - 1];
  789. }
  790. }
  791. if (fill_record_n_invoke_before_triggers(thd, table->field, *values, 0,
  792. table->triggers,
  793. TRG_EVENT_INSERT))
  794. {
  795. if (values_list.elements != 1 && ! thd->is_error())
  796. {
  797. info.records++;
  798. continue;
  799. }
  800. error=1;
  801. break;
  802. }
  803. }
  804. if ((res= table_list->view_check_option(thd,
  805. (values_list.elements == 1 ?
  806. 0 :
  807. ignore))) ==
  808. VIEW_CHECK_SKIP)
  809. continue;
  810. else if (res == VIEW_CHECK_ERROR)
  811. {
  812. error= 1;
  813. break;
  814. }
  815. #ifndef EMBEDDED_LIBRARY
  816. if (lock_type == TL_WRITE_DELAYED)
  817. {
  818. LEX_STRING const st_query = { query, thd->query_length() };
  819. error=write_delayed(thd, table, duplic, st_query, ignore, log_on);
  820. query=0;
  821. }
  822. else
  823. #endif
  824. error=write_record(thd, table ,&info);
  825. if (error)
  826. break;
  827. thd->warning_info->inc_current_row_for_warning();
  828. }
  829. free_underlaid_joins(thd, &thd->lex->select_lex);
  830. joins_freed= TRUE;
  831. /*
  832. Now all rows are inserted. Time to update logs and sends response to
  833. user
  834. */
  835. #ifndef EMBEDDED_LIBRARY
  836. if (lock_type == TL_WRITE_DELAYED)
  837. {
  838. if (!error)
  839. {
  840. info.copied=values_list.elements;
  841. end_delayed_insert(thd);
  842. }
  843. }
  844. else
  845. #endif
  846. {
  847. /*
  848. Do not do this release if this is a delayed insert, it would steal
  849. auto_inc values from the delayed_insert thread as they share TABLE.
  850. */
  851. table->file->ha_release_auto_increment();
  852. if (thd->locked_tables_mode <= LTM_LOCK_TABLES &&
  853. table->file->ha_end_bulk_insert() && !error)
  854. {
  855. table->file->print_error(my_errno,MYF(0));
  856. error=1;
  857. }
  858. if (duplic != DUP_ERROR || ignore)
  859. table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
  860. transactional_table= table->file->has_transactions();
  861. if ((changed= (info.copied || info.deleted || info.updated)))
  862. {
  863. /*
  864. Invalidate the table in the query cache if something changed.
  865. For the transactional algorithm to work the invalidation must be
  866. before binlog writing and ha_autocommit_or_rollback
  867. */
  868. query_cache_invalidate3(thd, table_list, 1);
  869. }
  870. if (thd->transaction.stmt.modified_non_trans_table)
  871. thd->transaction.all.modified_non_trans_table= TRUE;
  872. if ((changed && error <= 0) ||
  873. thd->transaction.stmt.modified_non_trans_table ||
  874. was_insert_delayed)
  875. {
  876. if (mysql_bin_log.is_open())
  877. {
  878. int errcode= 0;
  879. if (error <= 0)
  880. {
  881. /*
  882. [Guilhem wrote] Temporary errors may have filled
  883. thd->net.last_error/errno. For example if there has
  884. been a disk full error when writing the row, and it was
  885. MyISAM, then thd->net.last_error/errno will be set to
  886. "disk full"... and the mysql_file_pwrite() will wait until free
  887. space appears, and so when it finishes then the
  888. write_row() was entirely successful
  889. */
  890. /* todo: consider removing */
  891. thd->clear_error();
  892. }
  893. else
  894. errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
  895. /* bug#22725:
  896. A query which per-row-loop can not be interrupted with
  897. KILLED, like INSERT, and that does not invoke stored
  898. routines can be binlogged with neglecting the KILLED error.
  899. If there was no error (error == zero) until after the end of
  900. inserting loop the KILLED flag that appeared later can be
  901. disregarded since previously possible invocation of stored
  902. routines did not result in any error due to the KILLED. In
  903. such case the flag is ignored for constructing binlog event.
  904. */
  905. DBUG_ASSERT(thd->killed != THD::KILL_BAD_DATA || error > 0);
  906. if (was_insert_delayed && table_list->lock_type == TL_WRITE)
  907. {
  908. /* Binlog multi INSERT DELAYED as INSERT without DELAYED. */
  909. String log_query;
  910. if (create_insert_stmt_from_insert_delayed(thd, &log_query))
  911. {
  912. sql_print_error("Event Error: An error occurred while creating query string"
  913. "for INSERT DELAYED stmt, before writing it into binary log.");
  914. error= 1;
  915. }
  916. else if (thd->binlog_query(THD::ROW_QUERY_TYPE,
  917. log_query.c_ptr(), log_query.length(),
  918. transactional_table, FALSE, FALSE,
  919. errcode))
  920. error= 1;
  921. }
  922. else if (thd->binlog_query(THD::ROW_QUERY_TYPE,
  923. thd->query(), thd->query_length(),
  924. transactional_table, FALSE, FALSE,
  925. errcode))
  926. error= 1;
  927. }
  928. }
  929. DBUG_ASSERT(transactional_table || !changed ||
  930. thd->transaction.stmt.modified_non_trans_table);
  931. }
  932. thd_proc_info(thd, "end");
  933. /*
  934. We'll report to the client this id:
  935. - if the table contains an autoincrement column and we successfully
  936. inserted an autogenerated value, the autogenerated value.
  937. - if the table contains no autoincrement column and LAST_INSERT_ID(X) was
  938. called, X.
  939. - if the table contains an autoincrement column, and some rows were
  940. inserted, the id of the last "inserted" row (if IGNORE, that value may not
  941. have been really inserted but ignored).
  942. */
  943. id= (thd->first_successful_insert_id_in_cur_stmt > 0) ?
  944. thd->first_successful_insert_id_in_cur_stmt :
  945. (thd->arg_of_last_insert_id_function ?
  946. thd->first_successful_insert_id_in_prev_stmt :
  947. ((table->next_number_field && info.copied) ?
  948. table->next_number_field->val_int() : 0));
  949. table->next_number_field=0;
  950. thd->count_cuted_fields= CHECK_FIELD_IGNORE;
  951. table->auto_increment_field_not_null= FALSE;
  952. if (duplic == DUP_REPLACE &&
  953. (!table->triggers || !table->triggers->has_delete_triggers()))
  954. table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
  955. if (error)
  956. goto abort;
  957. if (values_list.elements == 1 && (!(thd->variables.option_bits & OPTION_WARNINGS) ||
  958. !thd->cuted_fields))
  959. {
  960. my_ok(thd, info.copied + info.deleted +
  961. ((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
  962. info.touched : info.updated),
  963. id);
  964. }
  965. else
  966. {
  967. char buff[160];
  968. ha_rows updated=((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
  969. info.touched : info.updated);
  970. if (ignore)
  971. sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
  972. (lock_type == TL_WRITE_DELAYED) ? (ulong) 0 :
  973. (ulong) (info.records - info.copied),
  974. (ulong) thd->warning_info->statement_warn_count());
  975. else
  976. sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
  977. (ulong) (info.deleted + updated),
  978. (ulong) thd->warning_info->statement_warn_count());
  979. ::my_ok(thd, info.copied + info.deleted + updated, id, buff);
  980. }
  981. thd->abort_on_warning= 0;
  982. DBUG_RETURN(FALSE);
  983. abort:
  984. #ifndef EMBEDDED_LIBRARY
  985. if (lock_type == TL_WRITE_DELAYED)
  986. end_delayed_insert(thd);
  987. #endif
  988. if (table != NULL)
  989. table->file->ha_release_auto_increment();
  990. if (!joins_freed)
  991. free_underlaid_joins(thd, &thd->lex->select_lex);
  992. thd->abort_on_warning= 0;
  993. DBUG_RETURN(TRUE);
  994. }
  995. /*
  996. Additional check for insertability for VIEW
  997. SYNOPSIS
  998. check_view_insertability()
  999. thd - thread handler
  1000. view - reference on VIEW
  1001. IMPLEMENTATION
  1002. A view is insertable if the folloings are true:
  1003. - All columns in the view are columns from a table
  1004. - All not used columns in table have a default values
  1005. - All field in view are unique (not referring to the same column)
  1006. RETURN
  1007. FALSE - OK
  1008. view->contain_auto_increment is 1 if and only if the view contains an
  1009. auto_increment field
  1010. TRUE - can't be used for insert
  1011. */
  1012. static bool check_view_insertability(THD * thd, TABLE_LIST *view)
  1013. {
  1014. uint num= view->view->select_lex.item_list.elements;
  1015. TABLE *table= view->table;
  1016. Field_translator *trans_start= view->field_translation,
  1017. *trans_end= trans_start + num;
  1018. Field_translator *trans;
  1019. uint used_fields_buff_size= bitmap_buffer_size(table->s->fields);
  1020. uint32 *used_fields_buff= (uint32*)thd->alloc(used_fields_buff_size);
  1021. MY_BITMAP used_fields;
  1022. enum_mark_columns save_mark_used_columns= thd->mark_used_columns;
  1023. DBUG_ENTER("check_key_in_view");
  1024. if (!used_fields_buff)
  1025. DBUG_RETURN(TRUE); // EOM
  1026. DBUG_ASSERT(view->table != 0 && view->field_translation != 0);
  1027. (void) bitmap_init(&used_fields, used_fields_buff, table->s->fields, 0);
  1028. bitmap_clear_all(&used_fields);
  1029. view->contain_auto_increment= 0;
  1030. /*
  1031. we must not set query_id for fields as they're not
  1032. really used in this context
  1033. */
  1034. thd->mark_used_columns= MARK_COLUMNS_NONE;
  1035. /* check simplicity and prepare unique test of view */
  1036. for (trans= trans_start; trans != trans_end; trans++)
  1037. {
  1038. if (!trans->item->fixed && trans->item->fix_fields(thd, &trans->item))
  1039. {
  1040. thd->mark_used_columns= save_mark_used_columns;
  1041. DBUG_RETURN(TRUE);
  1042. }
  1043. Item_field *field;
  1044. /* simple SELECT list entry (field without expression) */
  1045. if (!(field= trans->item->filed_for_view_update()))
  1046. {
  1047. thd->mark_used_columns= save_mark_used_columns;
  1048. DBUG_RETURN(TRUE);
  1049. }
  1050. if (field->field->unireg_check == Field::NEXT_NUMBER)
  1051. view->contain_auto_increment= 1;
  1052. /* prepare unique test */
  1053. /*
  1054. remove collation (or other transparent for update function) if we have
  1055. it
  1056. */
  1057. trans->item= field;
  1058. }
  1059. thd->mark_used_columns= save_mark_used_columns;
  1060. /* unique test */
  1061. for (trans= trans_start; trans != trans_end; trans++)
  1062. {
  1063. /* Thanks to test above, we know that all columns are of type Item_field */
  1064. Item_field *field= (Item_field *)trans->item;
  1065. /* check fields belong to table in which we are inserting */
  1066. if (field->field->table == table &&
  1067. bitmap_fast_test_and_set(&used_fields, field->field->field_index))
  1068. DBUG_RETURN(TRUE);
  1069. }
  1070. DBUG_RETURN(FALSE);
  1071. }
  1072. /*
  1073. Check if table can be updated
  1074. SYNOPSIS
  1075. mysql_prepare_insert_check_table()
  1076. thd Thread handle
  1077. table_list Table list
  1078. fields List of fields to be updated
  1079. where Pointer to where clause
  1080. select_insert Check is making for SELECT ... INSERT
  1081. RETURN
  1082. FALSE ok
  1083. TRUE ERROR
  1084. */
  1085. static bool mysql_prepare_insert_check_table(THD *thd, TABLE_LIST *table_list,
  1086. List<Item> &fields,
  1087. bool select_insert)
  1088. {
  1089. bool insert_into_view= (table_list->view != 0);
  1090. DBUG_ENTER("mysql_prepare_insert_check_table");
  1091. /*
  1092. first table in list is the one we'll INSERT into, requires INSERT_ACL.
  1093. all others require SELECT_ACL only. the ACL requirement below is for
  1094. new leaves only anyway (view-constituents), so check for SELECT rather
  1095. than INSERT.
  1096. */
  1097. if (setup_tables_and_check_access(thd, &thd->lex->select_lex.context,
  1098. &thd->lex->select_lex.top_join_list,
  1099. table_list,
  1100. &thd->lex->select_lex.leaf_tables,
  1101. select_insert, INSERT_ACL, SELECT_ACL))
  1102. DBUG_RETURN(TRUE);
  1103. if (insert_into_view && !fields.elements)
  1104. {
  1105. thd->lex->empty_field_list_on_rset= 1;
  1106. if (!table_list->table)
  1107. {
  1108. my_error(ER_VIEW_NO_INSERT_FIELD_LIST, MYF(0),
  1109. table_list->view_db.str, table_list->view_name.str);
  1110. DBUG_RETURN(TRUE);
  1111. }
  1112. DBUG_RETURN(insert_view_fields(thd, &fields, table_list));
  1113. }
  1114. DBUG_RETURN(FALSE);
  1115. }
  1116. /*
  1117. Get extra info for tables we insert into
  1118. @param table table(TABLE object) we insert into,
  1119. might be NULL in case of view
  1120. @param table(TABLE_LIST object) or view we insert into
  1121. */
  1122. static void prepare_for_positional_update(TABLE *table, TABLE_LIST *tables)
  1123. {
  1124. if (table)
  1125. {
  1126. if(table->reginfo.lock_type != TL_WRITE_DELAYED)
  1127. table->prepare_for_position();
  1128. return;
  1129. }
  1130. DBUG_ASSERT(tables->view);
  1131. List_iterator<TABLE_LIST> it(*tables->view_tables);
  1132. TABLE_LIST *tbl;
  1133. while ((tbl= it++))
  1134. prepare_for_positional_update(tbl->table, tbl);
  1135. return;
  1136. }
  1137. /*
  1138. Prepare items in INSERT statement
  1139. SYNOPSIS
  1140. mysql_prepare_insert()
  1141. thd Thread handler
  1142. table_list Global/local table list
  1143. table Table to insert into (can be NULL if table should
  1144. be taken from table_list->table)
  1145. where Where clause (for insert ... select)
  1146. select_insert TRUE if INSERT ... SELECT statement
  1147. check_fields TRUE if need to check that all INSERT fields are
  1148. given values.
  1149. abort_on_warning whether to report if some INSERT field is not
  1150. assigned as an error (TRUE) or as a warning (FALSE).
  1151. TODO (in far future)
  1152. In cases of:
  1153. INSERT INTO t1 SELECT a, sum(a) as sum1 from t2 GROUP BY a
  1154. ON DUPLICATE KEY ...
  1155. we should be able to refer to sum1 in the ON DUPLICATE KEY part
  1156. WARNING
  1157. You MUST set table->insert_values to 0 after calling this function
  1158. before releasing the table object.
  1159. RETURN VALUE
  1160. FALSE OK
  1161. TRUE error
  1162. */
  1163. bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
  1164. TABLE *table, List<Item> &fields, List_item *values,
  1165. List<Item> &update_fields, List<Item> &update_values,
  1166. enum_duplicates duplic,
  1167. COND **where, bool select_insert,
  1168. bool check_fields, bool abort_on_warning)
  1169. {
  1170. SELECT_LEX *select_lex= &thd->lex->select_lex;
  1171. Name_resolution_context *context= &select_lex->context;
  1172. Name_resolution_context_state ctx_state;
  1173. bool insert_into_view= (table_list->view != 0);
  1174. bool res= 0;
  1175. table_map map= 0;
  1176. DBUG_ENTER("mysql_prepare_insert");
  1177. DBUG_PRINT("enter", ("table_list 0x%lx, table 0x%lx, view %d",
  1178. (ulong)table_list, (ulong)table,
  1179. (int)insert_into_view));
  1180. /* INSERT should have a SELECT or VALUES clause */
  1181. DBUG_ASSERT (!select_insert || !values);
  1182. /*
  1183. For subqueries in VALUES() we should not see the table in which we are
  1184. inserting (for INSERT ... SELECT this is done by changing table_list,
  1185. because INSERT ... SELECT share SELECT_LEX it with SELECT.
  1186. */
  1187. if (!select_insert)
  1188. {
  1189. for (SELECT_LEX_UNIT *un= select_lex->first_inner_unit();
  1190. un;
  1191. un= un->next_unit())
  1192. {
  1193. for (SELECT_LEX *sl= un->first_select();
  1194. sl;
  1195. sl= sl->next_select())
  1196. {
  1197. sl->context.outer_context= 0;
  1198. }
  1199. }
  1200. }
  1201. if (duplic == DUP_UPDATE)
  1202. {
  1203. /* it should be allocated before Item::fix_fields() */
  1204. if (table_list->set_insert_values(thd->mem_root))
  1205. DBUG_RETURN(TRUE);
  1206. }
  1207. if (mysql_prepare_insert_check_table(thd, table_list, fields, select_insert))
  1208. DBUG_RETURN(TRUE);
  1209. /* Prepare the fields in the statement. */
  1210. if (values)
  1211. {
  1212. /* if we have INSERT ... VALUES () we cannot have a GROUP BY clause */
  1213. DBUG_ASSERT (!select_lex->group_list.elements);
  1214. /* Save the state of the current name resolution context. */
  1215. ctx_state.save_state(context, table_list);
  1216. /*
  1217. Perform name resolution only in the first table - 'table_list',
  1218. which is the table that is inserted into.
  1219. */
  1220. table_list->next_local= 0;
  1221. context->resolve_in_table_list_only(table_list);
  1222. res= (setup_fields(thd, 0, *values, MARK_COLUMNS_READ, 0, 0) ||
  1223. check_insert_fields(thd, context->table_list, fields, *values,
  1224. !insert_into_view, 0, &map));
  1225. if (!res && check_fields)
  1226. {
  1227. bool saved_abort_on_warning= thd->abort_on_warning;
  1228. thd->abort_on_warning= abort_on_warning;
  1229. res= check_that_all_fields_are_given_values(thd,
  1230. table ? table :
  1231. context->table_list->table,
  1232. context->table_list);
  1233. thd->abort_on_warning= saved_abort_on_warning;
  1234. }
  1235. if (!res)
  1236. res= setup_fields(thd, 0, update_values, MARK_COLUMNS_READ, 0, 0);
  1237. if (!res && duplic == DUP_UPDATE)
  1238. {
  1239. select_lex->no_wrap_view_item= TRUE;
  1240. res= check_update_fields(thd, context->table_list, update_fields,
  1241. update_values, &map);
  1242. select_lex->no_wrap_view_item= FALSE;
  1243. }
  1244. /* Restore the current context. */
  1245. ctx_state.restore_state(context, table_list);
  1246. }
  1247. if (res)
  1248. DBUG_RETURN(res);
  1249. if (!table)
  1250. table= table_list->table;
  1251. if (!select_insert)
  1252. {
  1253. Item *fake_conds= 0;
  1254. TABLE_LIST *duplicate;
  1255. if ((duplicate= unique_table(thd, table_list, table_list->next_global, 1)))
  1256. {
  1257. update_non_unique_table_error(table_list, "INSERT", duplicate);
  1258. DBUG_RETURN(TRUE);
  1259. }
  1260. select_lex->fix_prepare_information(thd, &fake_conds, &fake_conds);
  1261. select_lex->first_execution= 0;
  1262. }
  1263. /*
  1264. Only call prepare_for_posistion() if we are not performing a DELAYED
  1265. operation. It will instead be executed by delayed insert thread.
  1266. */
  1267. if (duplic == DUP_UPDATE || duplic == DUP_REPLACE)
  1268. prepare_for_positional_update(table, table_list);
  1269. DBUG_RETURN(FALSE);
  1270. }
  1271. /* Check if there is more uniq keys after field */
  1272. static int last_uniq_key(TABLE *table,uint keynr)
  1273. {
  1274. /*
  1275. When an underlying storage engine informs that the unique key
  1276. conflicts are not reported in the ascending order by setting
  1277. the HA_DUPLICATE_KEY_NOT_IN_ORDER flag, we cannot rely on this
  1278. information to determine the last key conflict.
  1279. The information about the last key conflict will be used to
  1280. do a replace of the new row on the conflicting row, rather
  1281. than doing a delete (of old row) + insert (of new row).
  1282. Hence check for this flag and disable replacing the last row
  1283. by returning 0 always. Returning 0 will result in doing
  1284. a delete + insert always.
  1285. */
  1286. if (table->file->ha_table_flags() & HA_DUPLICATE_KEY_NOT_IN_ORDER)
  1287. return 0;
  1288. while (++keynr < table->s->keys)
  1289. if (table->key_info[keynr].flags & HA_NOSAME)
  1290. return 0;
  1291. return 1;
  1292. }
  1293. /*
  1294. Write a record to table with optional deleting of conflicting records,
  1295. invoke proper triggers if needed.
  1296. SYNOPSIS
  1297. write_record()
  1298. thd - thread context
  1299. table - table to which record should be written
  1300. info - COPY_INFO structure describing handling of duplicates
  1301. and which is used for counting number of records inserted
  1302. and deleted.
  1303. NOTE
  1304. Once this record will be written to table after insert trigger will
  1305. be invoked. If instead of inserting new record we will update old one
  1306. then both on update triggers will work instead. Similarly both on
  1307. delete triggers will be invoked if we will delete conflicting records.
  1308. Sets thd->transaction.stmt.modified_non_trans_table to TRUE if table which is updated didn't have
  1309. transactions.
  1310. RETURN VALUE
  1311. 0 - success
  1312. non-0 - error
  1313. */
  1314. int write_record(THD *thd, TABLE *table,COPY_INFO *info)
  1315. {
  1316. int error, trg_error= 0;
  1317. char *key=0;
  1318. MY_BITMAP *save_read_set, *save_write_set;
  1319. ulonglong prev_insert_id= table->file->next_insert_id;
  1320. ulonglong insert_id_for_cur_row= 0;
  1321. DBUG_ENTER("write_record");
  1322. info->records++;
  1323. save_read_set= table->read_set;
  1324. save_write_set= table->write_set;
  1325. if (info->handle_duplicates == DUP_REPLACE ||
  1326. info->handle_duplicates == DUP_UPDATE)
  1327. {
  1328. while ((error=table->file->ha_write_row(table->record[0])))
  1329. {
  1330. uint key_nr;
  1331. /*
  1332. If we do more than one iteration of this loop, from the second one the
  1333. row will have an explicit value in the autoinc field, which was set at
  1334. the first call of handler::update_auto_increment(). So we must save
  1335. the autogenerated value to avoid thd->insert_id_for_cur_row to become
  1336. 0.
  1337. */
  1338. if (table->file->insert_id_for_cur_row > 0)
  1339. insert_id_for_cur_row= table->file->insert_id_for_cur_row;
  1340. else
  1341. table->file->insert_id_for_cur_row= insert_id_for_cur_row;
  1342. bool is_duplicate_key_error;
  1343. if (table->file->is_fatal_error(error, HA_CHECK_DUP))
  1344. goto err;
  1345. is_duplicate_key_error= table->file->is_fatal_error(error, 0);
  1346. if (!is_duplicate_key_error)
  1347. {
  1348. /*
  1349. We come here when we had an ignorable error which is not a duplicate
  1350. key error. In this we ignore error if ignore flag is set, otherwise
  1351. report error as usual. We will not do any duplicate key processing.
  1352. */
  1353. if (info->ignore)
  1354. goto ok_or_after_trg_err; /* Ignoring a not fatal error, return 0 */
  1355. goto err;
  1356. }
  1357. if ((int) (key_nr = table->file->get_dup_key(error)) < 0)
  1358. {
  1359. error= HA_ERR_FOUND_DUPP_KEY; /* Database can't find key */
  1360. goto err;
  1361. }
  1362. /* Read all columns for the row we are going to replace */
  1363. table->use_all_columns();
  1364. /*
  1365. Don't allow REPLACE to replace a row when a auto_increment column
  1366. was used. This ensures that we don't get a problem when the
  1367. whole range of the key has been used.
  1368. */
  1369. if (info->handle_duplicates == DUP_REPLACE &&
  1370. table->next_number_field &&
  1371. key_nr == table->s->next_number_index &&
  1372. (insert_id_for_cur_row > 0))
  1373. goto err;
  1374. if (table->file->ha_table_flags() & HA_DUPLICATE_POS)
  1375. {
  1376. if (table->file->rnd_pos(table->record[1],table->file->dup_ref))
  1377. goto err;
  1378. }
  1379. else
  1380. {
  1381. if (table->file->extra(HA_EXTRA_FLUSH_CACHE)) /* Not needed with NISAM */
  1382. {
  1383. error=my_errno;
  1384. goto err;
  1385. }
  1386. if (!key)
  1387. {
  1388. if (!(key=(char*) my_safe_alloca(table->s->max_unique_length,
  1389. MAX_KEY_LENGTH)))
  1390. {
  1391. error=ENOMEM;
  1392. goto err;
  1393. }
  1394. }
  1395. key_copy((uchar*) key,table->record[0],table->key_info+key_nr,0);
  1396. if ((error=(table->file->index_read_idx_map(table->record[1],key_nr,
  1397. (uchar*) key, HA_WHOLE_KEY,
  1398. HA_READ_KEY_EXACT))))
  1399. goto err;
  1400. }
  1401. if (info->handle_duplicates == DUP_UPDATE)
  1402. {
  1403. int res= 0;
  1404. /*
  1405. We don't check for other UNIQUE keys - the first row
  1406. that matches, is updated. If update causes a conflict again,
  1407. an error is returned
  1408. */
  1409. DBUG_ASSERT(table->insert_values != NULL);
  1410. store_record(table,insert_values);
  1411. restore_record(table,record[1]);
  1412. DBUG_ASSERT(info->update_fields->elements ==
  1413. info->update_values->elements);
  1414. if (fill_record_n_invoke_before_triggers(thd, *info->update_fields,
  1415. *info->update_values,
  1416. info->ignore,
  1417. table->triggers,
  1418. TRG_EVENT_UPDATE))
  1419. goto before_trg_err;
  1420. /* CHECK OPTION for VIEW ... ON DUPLICATE KEY UPDATE ... */
  1421. if (info->view &&
  1422. (res= info->view->view_check_option(current_thd, info->ignore)) ==
  1423. VIEW_CHECK_SKIP)
  1424. goto ok_or_after_trg_err;
  1425. if (res == VIEW_CHECK_ERROR)
  1426. goto before_trg_err;
  1427. table->file->restore_auto_increment(prev_insert_id);
  1428. if (table->next_number_field)
  1429. table->file->adjust_next_insert_id_after_explicit_value(
  1430. table->next_number_field->val_int());
  1431. info->touched++;
  1432. if (!records_are_comparable(table) || compare_records(table))
  1433. {
  1434. if ((error=table->file->ha_update_row(table->record[1],
  1435. table->record[0])) &&
  1436. error != HA_ERR_RECORD_IS_THE_SAME)
  1437. {
  1438. if (info->ignore &&
  1439. !table->file->is_fatal_error(error, HA_CHECK_DUP_KEY))
  1440. {
  1441. goto ok_or_after_trg_err;
  1442. }
  1443. goto err;
  1444. }
  1445. if (error != HA_ERR_RECORD_IS_THE_SAME)
  1446. info->updated++;
  1447. else
  1448. error= 0;
  1449. /*
  1450. If ON DUP KEY UPDATE updates a row instead of inserting one, it's
  1451. like a regular UPDATE statement: it should not affect the value of a
  1452. next SELECT LAST_INSERT_ID() or mysql_insert_id().
  1453. Except if LAST_INSERT_ID(#) was in the INSERT query, which is
  1454. handled separately by THD::arg_of_last_insert_id_function.
  1455. */
  1456. insert_id_for_cur_row= table->file->insert_id_for_cur_row= 0;
  1457. trg_error= (table->triggers &&
  1458. table->triggers->process_triggers(thd, TRG_EVENT_UPDATE,
  1459. TRG_ACTION_AFTER, TRUE));
  1460. info->copied++;
  1461. }
  1462. if (table->next_number_field)
  1463. table->file->adjust_next_insert_id_after_explicit_value(
  1464. table->next_number_field->val_int());
  1465. info->touched++;
  1466. goto ok_or_after_trg_err;
  1467. }
  1468. else /* DUP_REPLACE */
  1469. {
  1470. /*
  1471. The manual defines the REPLACE semantics that it is either
  1472. an INSERT or DELETE(s) + INSERT; FOREIGN KEY checks in
  1473. InnoDB do not function in the defined way if we allow MySQL
  1474. to convert the latter operation internally to an UPDATE.
  1475. We also should not perform this conversion if we have
  1476. timestamp field with ON UPDATE which is different from DEFAULT.
  1477. Another case when conversion should not be performed is when
  1478. we have ON DELETE trigger on table so user may notice that
  1479. we cheat here. Note that it is ok to do such conversion for
  1480. tables which have ON UPDATE but have no ON DELETE triggers,
  1481. we just should not expose this fact to users by invoking
  1482. ON UPDATE triggers.
  1483. */
  1484. if (last_uniq_key(table,key_nr) &&
  1485. !table->file->referenced_by_foreign_key() &&
  1486. (table->timestamp_field_type == TIMESTAMP_NO_AUTO_SET ||
  1487. table->timestamp_field_type == TIMESTAMP_AUTO_SET_ON_BOTH) &&
  1488. (!table->triggers || !table->triggers->has_delete_triggers()))
  1489. {
  1490. if ((error=table->file->ha_update_row(table->record[1],
  1491. table->record[0])) &&
  1492. error != HA_ERR_RECORD_IS_THE_SAME)
  1493. goto err;
  1494. if (error != HA_ERR_RECORD_IS_THE_SAME)
  1495. info->deleted++;
  1496. else
  1497. error= 0;
  1498. thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
  1499. /*
  1500. Since we pretend that we have done insert we should call
  1501. its after triggers.
  1502. */
  1503. goto after_trg_n_copied_inc;
  1504. }
  1505. else
  1506. {
  1507. if (table->triggers &&
  1508. table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
  1509. TRG_ACTION_BEFORE, TRUE))
  1510. goto before_trg_err;
  1511. if ((error=table->file->ha_delete_row(table->record[1])))
  1512. goto err;
  1513. info->deleted++;
  1514. if (!table->file->has_transactions())
  1515. thd->transaction.stmt.modified_non_trans_table= TRUE;
  1516. if (table->triggers &&
  1517. table->triggers->process_triggers(thd, TRG_EVENT_DELETE,
  1518. TRG_ACTION_AFTER, TRUE))
  1519. {
  1520. trg_error= 1;
  1521. goto ok_or_after_trg_err;
  1522. }
  1523. /* Let us attempt do write_row() once more */
  1524. }
  1525. }
  1526. }
  1527. /*
  1528. If more than one iteration of the above while loop is done, from the second
  1529. one the row being inserted will have an explicit value in the autoinc field,
  1530. which was set at the first call of handler::update_auto_increment(). This
  1531. value is saved to avoid thd->insert_id_for_cur_row becoming 0. Use this saved
  1532. autoinc value.
  1533. */
  1534. if (table->file->insert_id_for_cur_row == 0)
  1535. table->file->insert_id_for_cur_row= insert_id_for_cur_row;
  1536. thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
  1537. /*
  1538. Restore column maps if they where replaced during an duplicate key
  1539. problem.
  1540. */
  1541. if (table->read_set != save_read_set ||
  1542. table->write_set != save_write_set)
  1543. table->column_bitmaps_set(save_read_set, save_write_set);
  1544. }
  1545. else if ((error=table->file->ha_write_row(table->record[0])))
  1546. {
  1547. if (!info->ignore ||
  1548. table->file->is_fatal_error(error, HA_CHECK_DUP))
  1549. goto err;
  1550. table->file->restore_auto_increment(prev_insert_id);
  1551. goto ok_or_after_trg_err;
  1552. }
  1553. after_trg_n_copied_inc:
  1554. info->copied++;
  1555. thd->record_first_successful_insert_id_in_cur_stmt(table->file->insert_id_for_cur_row);
  1556. trg_error= (table->triggers &&
  1557. table->triggers->process_triggers(thd, TRG_EVENT_INSERT,
  1558. TRG_ACTION_AFTER, TRUE));
  1559. ok_or_after_trg_err:
  1560. if (key)
  1561. my_safe_afree(key,table->s->max_unique_length,MAX_KEY_LENGTH);
  1562. if (!table->file->has_transactions())
  1563. thd->transaction.stmt.modified_non_trans_table= TRUE;
  1564. DBUG_RETURN(trg_error);
  1565. err:
  1566. info->last_errno= error;
  1567. /* current_select is NULL if this is a delayed insert */
  1568. if (thd->lex->current_select)
  1569. thd->lex->current_select->no_error= 0; // Give error
  1570. table->file->print_error(error,MYF(0));
  1571. before_trg_err:
  1572. table->file->restore_auto_increment(prev_insert_id);
  1573. if (key)
  1574. my_safe_afree(key, table->s->max_unique_length, MAX_KEY_LENGTH);
  1575. table->column_bitmaps_set(save_read_set, save_write_set);
  1576. DBUG_RETURN(1);
  1577. }
  1578. /******************************************************************************
  1579. Check that all fields with arn't null_fields are used
  1580. ******************************************************************************/
  1581. int check_that_all_fields_are_given_values(THD *thd, TABLE *entry,
  1582. TABLE_LIST *table_list)
  1583. {
  1584. int err= 0;
  1585. MY_BITMAP *write_set= entry->write_set;
  1586. for (Field **field=entry->field ; *field ; field++)
  1587. {
  1588. if (!bitmap_is_set(write_set, (*field)->field_index) &&
  1589. ((*field)->flags & NO_DEFAULT_VALUE_FLAG) &&
  1590. ((*field)->real_type() != MYSQL_TYPE_ENUM))
  1591. {
  1592. bool view= FALSE;
  1593. if (table_list)
  1594. {
  1595. table_list= table_list->top_table();
  1596. view= test(table_list->view);
  1597. }
  1598. if (view)
  1599. {
  1600. push_warning_printf(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
  1601. ER_NO_DEFAULT_FOR_VIEW_FIELD,
  1602. ER(ER_NO_DEFAULT_FOR_VIEW_FIELD),
  1603. table_list->view_db.str,
  1604. table_list->view_name.str);
  1605. }
  1606. else
  1607. {
  1608. push_warning_printf(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
  1609. ER_NO_DEFAULT_FOR_FIELD,
  1610. ER(ER_NO_DEFAULT_FOR_FIELD),
  1611. (*field)->field_name);
  1612. }
  1613. err= 1;
  1614. }
  1615. }
  1616. return thd->abort_on_warning ? err : 0;
  1617. }
  1618. /*****************************************************************************
  1619. Handling of delayed inserts
  1620. A thread is created for each table that one uses with the DELAYED attribute.
  1621. *****************************************************************************/
  1622. #ifndef EMBEDDED_LIBRARY
  1623. class delayed_row :public ilink {
  1624. public:
  1625. char *record;
  1626. enum_duplicates dup;
  1627. time_t start_time;
  1628. ulong sql_mode;
  1629. bool auto_increment_field_not_null;
  1630. bool query_start_used, ignore, log_query;
  1631. bool stmt_depends_on_first_successful_insert_id_in_prev_stmt;
  1632. ulonglong first_successful_insert_id_in_prev_stmt;
  1633. ulonglong forced_insert_id;
  1634. ulong auto_increment_increment;
  1635. ulong auto_increment_offset;
  1636. timestamp_auto_set_type timestamp_field_type;
  1637. LEX_STRING query;
  1638. Time_zone *time_zone;
  1639. delayed_row(LEX_STRING const query_arg, enum_duplicates dup_arg,
  1640. bool ignore_arg, bool log_query_arg)
  1641. : record(0), dup(dup_arg), ignore(ignore_arg), log_query(log_query_arg),
  1642. forced_insert_id(0), query(query_arg), time_zone(0)
  1643. {}
  1644. ~delayed_row()
  1645. {
  1646. my_free(query.str);
  1647. my_free(record);
  1648. }
  1649. };
  1650. /**
  1651. Delayed_insert - context of a thread responsible for delayed insert
  1652. into one table. When processing delayed inserts, we create an own
  1653. thread for every distinct table. Later on all delayed inserts directed
  1654. into that table are handled by a dedicated thread.
  1655. */
  1656. class Delayed_insert :public ilink {
  1657. uint locks_in_memory;
  1658. thr_lock_type delayed_lock;
  1659. public:
  1660. THD thd;
  1661. TABLE *table;
  1662. mysql_mutex_t mutex;
  1663. mysql_cond_t cond, cond_client;
  1664. volatile uint tables_in_use,stacked_inserts;
  1665. volatile bool status;
  1666. /*
  1667. When the handler thread starts, it clones a metadata lock ticket
  1668. for the table to be inserted. This is done to allow the deadlock
  1669. detector to detect deadlocks resulting from this lock.
  1670. Before this is done, the connection thread cannot safely exit
  1671. without causing problems for clone_ticket().
  1672. Once handler_thread_initialized has been set, it is safe for the
  1673. connection thread to exit.
  1674. Access to handler_thread_initialized is protected by di->mutex.
  1675. */
  1676. bool handler_thread_initialized;
  1677. COPY_INFO info;
  1678. I_List<delayed_row> rows;
  1679. ulong group_count;
  1680. TABLE_LIST table_list; // Argument
  1681. Delayed_insert()
  1682. :locks_in_memory(0), table(0),tables_in_use(0),stacked_inserts(0),
  1683. status(0), handler_thread_initialized(FALSE), group_count(0)
  1684. {
  1685. DBUG_ENTER("Delayed_insert constructor");
  1686. thd.security_ctx->user=(char*) delayed_user;
  1687. thd.security_ctx->host=(char*) my_localhost;
  1688. strmake(thd.security_ctx->priv_user, thd.security_ctx->user,
  1689. USERNAME_LENGTH);
  1690. thd.current_tablenr=0;
  1691. thd.command=COM_DELAYED_INSERT;
  1692. thd.lex->current_select= 0; // for my_message_sql
  1693. thd.lex->sql_command= SQLCOM_INSERT; // For innodb::store_lock()
  1694. /*
  1695. Statement-based replication of INSERT DELAYED has problems with
  1696. RAND() and user variables, so in mixed mode we go to row-based.
  1697. For normal commands, the unsafe flag is set at parse time.
  1698. However, since the flag is a member of the THD object, of which
  1699. the delayed_insert thread has its own copy, we must set the
  1700. statement to unsafe here and explicitly set row logging mode.
  1701. @todo set_current_stmt_binlog_format_row_if_mixed should not be
  1702. called by anything else than thd->decide_logging_format(). When
  1703. we call set_current_blah here, none of the checks in
  1704. decide_logging_format is made. We should probably call
  1705. thd->decide_logging_format() directly instead. /Sven
  1706. */
  1707. thd.lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_INSERT_DELAYED);
  1708. thd.set_current_stmt_binlog_format_row_if_mixed();
  1709. /*
  1710. Prevent changes to global.lock_wait_timeout from affecting
  1711. delayed insert threads as any timeouts in delayed inserts
  1712. are not communicated to the client.
  1713. */
  1714. thd.variables.lock_wait_timeout= LONG_TIMEOUT;
  1715. bzero((char*) &thd.net, sizeof(thd.net)); // Safety
  1716. bzero((char*) &table_list, sizeof(table_list)); // Safety
  1717. thd.system_thread= SYSTEM_THREAD_DELAYED_INSERT;
  1718. thd.security_ctx->host_or_ip= "";
  1719. bzero((char*) &info,sizeof(info));
  1720. mysql_mutex_init(key_delayed_insert_mutex, &mutex, MY_MUTEX_INIT_FAST);
  1721. mysql_cond_init(key_delayed_insert_cond, &cond, NULL);
  1722. mysql_cond_init(key_delayed_insert_cond_client, &cond_client, NULL);
  1723. mysql_mutex_lock(&LOCK_thread_count);
  1724. delayed_insert_threads++;
  1725. delayed_lock= global_system_variables.low_priority_updates ?
  1726. TL_WRITE_LOW_PRIORITY : TL_WRITE;
  1727. mysql_mutex_unlock(&LOCK_thread_count);
  1728. DBUG_VOID_RETURN;
  1729. }
  1730. ~Delayed_insert()
  1731. {
  1732. /* The following is not really needed, but just for safety */
  1733. delayed_row *row;
  1734. while ((row=rows.get()))
  1735. delete row;
  1736. if (table)
  1737. {
  1738. close_thread_tables(&thd);
  1739. thd.mdl_context.release_transactional_locks();
  1740. }
  1741. mysql_mutex_lock(&LOCK_thread_count);
  1742. mysql_mutex_destroy(&mutex);
  1743. mysql_cond_destroy(&cond);
  1744. mysql_cond_destroy(&cond_client);
  1745. thd.unlink(); // Must be unlinked under lock
  1746. my_free(thd.query());
  1747. thd.security_ctx->user= thd.security_ctx->host=0;
  1748. thread_count--;
  1749. delayed_insert_threads--;
  1750. mysql_mutex_unlock(&LOCK_thread_count);
  1751. mysql_cond_broadcast(&COND_thread_count); /* Tell main we are ready */
  1752. }
  1753. /* The following is for checking when we can delete ourselves */
  1754. inline void lock()
  1755. {
  1756. locks_in_memory++; // Assume LOCK_delay_insert
  1757. }
  1758. void unlock()
  1759. {
  1760. mysql_mutex_lock(&LOCK_delayed_insert);
  1761. if (!--locks_in_memory)
  1762. {
  1763. mysql_mutex_lock(&mutex);
  1764. if (thd.killed && ! stacked_inserts && ! tables_in_use)
  1765. {
  1766. mysql_cond_signal(&cond);
  1767. status=1;
  1768. }
  1769. mysql_mutex_unlock(&mutex);
  1770. }
  1771. mysql_mutex_unlock(&LOCK_delayed_insert);
  1772. }
  1773. inline uint lock_count() { return locks_in_memory; }
  1774. TABLE* get_local_table(THD* client_thd);
  1775. bool open_and_lock_table();
  1776. bool handle_inserts(void);
  1777. };
  1778. I_List<Delayed_insert> delayed_threads;
  1779. /**
  1780. Return an instance of delayed insert thread that can handle
  1781. inserts into a given table, if it exists. Otherwise return NULL.
  1782. */
  1783. static
  1784. Delayed_insert *find_handler(THD *thd, TABLE_LIST *table_list)
  1785. {
  1786. thd_proc_info(thd, "waiting for delay_list");
  1787. mysql_mutex_lock(&LOCK_delayed_insert); // Protect master list
  1788. I_List_iterator<Delayed_insert> it(delayed_threads);
  1789. Delayed_insert *di;
  1790. while ((di= it++))
  1791. {
  1792. if (!strcmp(table_list->db, di->table_list.db) &&
  1793. !strcmp(table_list->table_name, di->table_list.table_name))
  1794. {
  1795. di->lock();
  1796. break;
  1797. }
  1798. }
  1799. mysql_mutex_unlock(&LOCK_delayed_insert); // For unlink from list
  1800. return di;
  1801. }
  1802. /**
  1803. Attempt to find or create a delayed insert thread to handle inserts
  1804. into this table.
  1805. @return In case of success, table_list->table points to a local copy
  1806. of the delayed table or is set to NULL, which indicates a
  1807. request for lock upgrade. In case of failure, value of
  1808. table_list->table is undefined.
  1809. @retval TRUE - this thread ran out of resources OR
  1810. - a newly created delayed insert thread ran out of
  1811. resources OR
  1812. - the created thread failed to open and lock the table
  1813. (e.g. because it does not exist) OR
  1814. - the table opened in the created thread turned out to
  1815. be a view
  1816. @retval FALSE - table successfully opened OR
  1817. - too many delayed insert threads OR
  1818. - the table has triggers and we have to fall back to
  1819. a normal INSERT
  1820. Two latter cases indicate a request for lock upgrade.
  1821. XXX: why do we regard INSERT DELAYED into a view as an error and
  1822. do not simply perform a lock upgrade?
  1823. TODO: The approach with using two mutexes to work with the
  1824. delayed thread list -- LOCK_delayed_insert and
  1825. LOCK_delayed_create -- is redundant, and we only need one of
  1826. them to protect the list. The reason we have two locks is that
  1827. we do not want to block look-ups in the list while we're waiting
  1828. for the newly created thread to open the delayed table. However,
  1829. this wait itself is redundant -- we always call get_local_table
  1830. later on, and there wait again until the created thread acquires
  1831. a table lock.
  1832. As is redundant the concept of locks_in_memory, since we already
  1833. have another counter with similar semantics - tables_in_use,
  1834. both of them are devoted to counting the number of producers for
  1835. a given consumer (delayed insert thread), only at different
  1836. stages of producer-consumer relationship.
  1837. The 'status' variable in Delayed_insert is redundant
  1838. too, since there is already di->stacked_inserts.
  1839. */
  1840. static
  1841. bool delayed_get_table(THD *thd, TABLE_LIST *table_list)
  1842. {
  1843. int error;
  1844. Delayed_insert *di;
  1845. DBUG_ENTER("delayed_get_table");
  1846. /* Must be set in the parser */
  1847. DBUG_ASSERT(table_list->db);
  1848. /* Find the thread which handles this table. */
  1849. if (!(di= find_handler(thd, table_list)))
  1850. {
  1851. /*
  1852. No match. Create a new thread to handle the table, but
  1853. no more than max_insert_delayed_threads.
  1854. */
  1855. if (delayed_insert_threads >= thd->variables.max_insert_delayed_threads)
  1856. DBUG_RETURN(0);
  1857. thd_proc_info(thd, "Creating delayed handler");
  1858. mysql_mutex_lock(&LOCK_delayed_create);
  1859. /*
  1860. The first search above was done without LOCK_delayed_create.
  1861. Another thread might have created the handler in between. Search again.
  1862. */
  1863. if (! (di= find_handler(thd, table_list)))
  1864. {
  1865. if (!(di= new Delayed_insert()))
  1866. goto end_create;
  1867. mysql_mutex_lock(&LOCK_thread_count);
  1868. thread_count++;
  1869. mysql_mutex_unlock(&LOCK_thread_count);
  1870. di->thd.set_db(table_list->db, (uint) strlen(table_list->db));
  1871. di->thd.set_query(my_strdup(table_list->table_name,
  1872. MYF(MY_WME | ME_FATALERROR)), 0);
  1873. if (di->thd.db == NULL || di->thd.query() == NULL)
  1874. {
  1875. /* The error is reported */
  1876. delete di;
  1877. goto end_create;
  1878. }
  1879. di->table_list= *table_list; // Needed to open table
  1880. /* Replace volatile strings with local copies */
  1881. di->table_list.alias= di->table_list.table_name= di->thd.query();
  1882. di->table_list.db= di->thd.db;
  1883. /* We need the ticket so that it can be cloned in handle_delayed_insert */
  1884. init_mdl_requests(&di->table_list);
  1885. di->table_list.mdl_request.ticket= table_list->mdl_request.ticket;
  1886. di->lock();
  1887. mysql_mutex_lock(&di->mutex);
  1888. if ((error= mysql_thread_create(key_thread_delayed_insert,
  1889. &di->thd.real_id, &connection_attrib,
  1890. handle_delayed_insert, (void*) di)))
  1891. {
  1892. DBUG_PRINT("error",
  1893. ("Can't create thread to handle delayed insert (error %d)",
  1894. error));
  1895. mysql_mutex_unlock(&di->mutex);
  1896. di->unlock();
  1897. delete di;
  1898. my_error(ER_CANT_CREATE_THREAD, MYF(ME_FATALERROR), error);
  1899. goto end_create;
  1900. }
  1901. /*
  1902. Wait until table is open unless the handler thread or the connection
  1903. thread has been killed. Note that we in all cases must wait until the
  1904. handler thread has been properly initialized before exiting. Otherwise
  1905. we risk doing clone_ticket() on a ticket that is no longer valid.
  1906. */
  1907. thd_proc_info(thd, "waiting for handler open");
  1908. while (!di->handler_thread_initialized ||
  1909. (!di->thd.killed && !di->table && !thd->killed))
  1910. {
  1911. mysql_cond_wait(&di->cond_client, &di->mutex);
  1912. }
  1913. mysql_mutex_unlock(&di->mutex);
  1914. thd_proc_info(thd, "got old table");
  1915. if (thd->killed)
  1916. {
  1917. di->unlock();
  1918. goto end_create;
  1919. }
  1920. if (di->thd.killed)
  1921. {
  1922. if (di->thd.is_error())
  1923. {
  1924. /*
  1925. Copy the error message. Note that we don't treat fatal
  1926. errors in the delayed thread as fatal errors in the
  1927. main thread. If delayed thread was killed, we don't
  1928. want to send "Server shutdown in progress" in the
  1929. INSERT THREAD.
  1930. */
  1931. if (di->thd.stmt_da->sql_errno() == ER_SERVER_SHUTDOWN)
  1932. my_message(ER_QUERY_INTERRUPTED, ER(ER_QUERY_INTERRUPTED), MYF(0));
  1933. else
  1934. my_message(di->thd.stmt_da->sql_errno(), di->thd.stmt_da->message(),
  1935. MYF(0));
  1936. }
  1937. di->unlock();
  1938. goto end_create;
  1939. }
  1940. mysql_mutex_lock(&LOCK_delayed_insert);
  1941. delayed_threads.append(di);
  1942. mysql_mutex_unlock(&LOCK_delayed_insert);
  1943. }
  1944. mysql_mutex_unlock(&LOCK_delayed_create);
  1945. }
  1946. mysql_mutex_lock(&di->mutex);
  1947. table_list->table= di->get_local_table(thd);
  1948. mysql_mutex_unlock(&di->mutex);
  1949. if (table_list->table)
  1950. {
  1951. DBUG_ASSERT(! thd->is_error());
  1952. thd->di= di;
  1953. }
  1954. /* Unlock the delayed insert object after its last access. */
  1955. di->unlock();
  1956. DBUG_RETURN((table_list->table == NULL));
  1957. end_create:
  1958. mysql_mutex_unlock(&LOCK_delayed_create);
  1959. DBUG_RETURN(thd->is_error());
  1960. }
  1961. /**
  1962. As we can't let many client threads modify the same TABLE
  1963. structure of the dedicated delayed insert thread, we create an
  1964. own structure for each client thread. This includes a row
  1965. buffer to save the column values and new fields that point to
  1966. the new row buffer. The memory is allocated in the client
  1967. thread and is freed automatically.
  1968. @pre This function is called from the client thread. Delayed
  1969. insert thread mutex must be acquired before invoking this
  1970. function.
  1971. @return Not-NULL table object on success. NULL in case of an error,
  1972. which is set in client_thd.
  1973. */
  1974. TABLE *Delayed_insert::get_local_table(THD* client_thd)
  1975. {
  1976. my_ptrdiff_t adjust_ptrs;
  1977. Field **field,**org_field, *found_next_number_field;
  1978. TABLE *copy;
  1979. TABLE_SHARE *share;
  1980. uchar *bitmap;
  1981. DBUG_ENTER("Delayed_insert::get_local_table");
  1982. /* First request insert thread to get a lock */
  1983. status=1;
  1984. tables_in_use++;
  1985. if (!thd.lock) // Table is not locked
  1986. {
  1987. thd_proc_info(client_thd, "waiting for handler lock");
  1988. mysql_cond_signal(&cond); // Tell handler to lock table
  1989. while (!thd.killed && !thd.lock && ! client_thd->killed)
  1990. {
  1991. mysql_cond_wait(&cond_client, &mutex);
  1992. }
  1993. thd_proc_info(client_thd, "got handler lock");
  1994. if (client_thd->killed)
  1995. goto error;
  1996. if (thd.killed)
  1997. {
  1998. /*
  1999. Copy the error message. Note that we don't treat fatal
  2000. errors in the delayed thread as fatal errors in the
  2001. main thread. If delayed thread was killed, we don't
  2002. want to send "Server shutdown in progress" in the
  2003. INSERT THREAD.
  2004. The thread could be killed with an error message if
  2005. di->handle_inserts() or di->open_and_lock_table() fails.
  2006. The thread could be killed without an error message if
  2007. killed using mysql_notify_thread_having_shared_lock() or
  2008. kill_delayed_threads_for_table().
  2009. */
  2010. if (!thd.is_error() || thd.stmt_da->sql_errno() == ER_SERVER_SHUTDOWN)
  2011. my_message(ER_QUERY_INTERRUPTED, ER(ER_QUERY_INTERRUPTED), MYF(0));
  2012. else
  2013. my_message(thd.stmt_da->sql_errno(), thd.stmt_da->message(), MYF(0));
  2014. goto error;
  2015. }
  2016. }
  2017. share= table->s;
  2018. /*
  2019. Allocate memory for the TABLE object, the field pointers array, and
  2020. one record buffer of reclength size. Normally a table has three
  2021. record buffers of rec_buff_length size, which includes alignment
  2022. bytes. Since the table copy is used for creating one record only,
  2023. the other record buffers and alignment are unnecessary.
  2024. */
  2025. thd_proc_info(client_thd, "allocating local table");
  2026. copy= (TABLE*) client_thd->alloc(sizeof(*copy)+
  2027. (share->fields+1)*sizeof(Field**)+
  2028. share->reclength +
  2029. share->column_bitmap_size*2);
  2030. if (!copy)
  2031. goto error;
  2032. /* Copy the TABLE object. */
  2033. *copy= *table;
  2034. /* We don't need to change the file handler here */
  2035. /* Assign the pointers for the field pointers array and the record. */
  2036. field= copy->field= (Field**) (copy + 1);
  2037. bitmap= (uchar*) (field + share->fields + 1);
  2038. copy->record[0]= (bitmap + share->column_bitmap_size * 2);
  2039. memcpy((char*) copy->record[0], (char*) table->record[0], share->reclength);
  2040. /*
  2041. Make a copy of all fields.
  2042. The copied fields need to point into the copied record. This is done
  2043. by copying the field objects with their old pointer values and then
  2044. "move" the pointers by the distance between the original and copied
  2045. records. That way we preserve the relative positions in the records.
  2046. */
  2047. adjust_ptrs= PTR_BYTE_DIFF(copy->record[0], table->record[0]);
  2048. found_next_number_field= table->found_next_number_field;
  2049. for (org_field= table->field; *org_field; org_field++, field++)
  2050. {
  2051. if (!(*field= (*org_field)->new_field(client_thd->mem_root, copy, 1)))
  2052. goto error;
  2053. (*field)->orig_table= copy; // Remove connection
  2054. (*field)->move_field_offset(adjust_ptrs); // Point at copy->record[0]
  2055. if (*org_field == found_next_number_field)
  2056. (*field)->table->found_next_number_field= *field;
  2057. }
  2058. *field=0;
  2059. /* Adjust timestamp */
  2060. if (table->timestamp_field)
  2061. {
  2062. /* Restore offset as this may have been reset in handle_inserts */
  2063. copy->timestamp_field=
  2064. (Field_timestamp*) copy->field[share->timestamp_field_offset];
  2065. copy->timestamp_field->unireg_check= table->timestamp_field->unireg_check;
  2066. copy->timestamp_field_type= copy->timestamp_field->get_auto_set_type();
  2067. }
  2068. /* Adjust in_use for pointing to client thread */
  2069. copy->in_use= client_thd;
  2070. /* Adjust lock_count. This table object is not part of a lock. */
  2071. copy->lock_count= 0;
  2072. /* Adjust bitmaps */
  2073. copy->def_read_set.bitmap= (my_bitmap_map*) bitmap;
  2074. copy->def_write_set.bitmap= ((my_bitmap_map*)
  2075. (bitmap + share->column_bitmap_size));
  2076. copy->tmp_set.bitmap= 0; // To catch errors
  2077. bzero((char*) bitmap, share->column_bitmap_size*2);
  2078. copy->read_set= &copy->def_read_set;
  2079. copy->write_set= &copy->def_write_set;
  2080. DBUG_RETURN(copy);
  2081. /* Got fatal error */
  2082. error:
  2083. tables_in_use--;
  2084. status=1;
  2085. mysql_cond_signal(&cond); // Inform thread about abort
  2086. DBUG_RETURN(0);
  2087. }
  2088. /* Put a question in queue */
  2089. static
  2090. int write_delayed(THD *thd, TABLE *table, enum_duplicates duplic,
  2091. LEX_STRING query, bool ignore, bool log_on)
  2092. {
  2093. delayed_row *row= 0;
  2094. Delayed_insert *di=thd->di;
  2095. const Discrete_interval *forced_auto_inc;
  2096. DBUG_ENTER("write_delayed");
  2097. DBUG_PRINT("enter", ("query = '%s' length %lu", query.str,
  2098. (ulong) query.length));
  2099. thd_proc_info(thd, "waiting for handler insert");
  2100. mysql_mutex_lock(&di->mutex);
  2101. while (di->stacked_inserts >= delayed_queue_size && !thd->killed)
  2102. mysql_cond_wait(&di->cond_client, &di->mutex);
  2103. thd_proc_info(thd, "storing row into queue");
  2104. if (thd->killed)
  2105. goto err;
  2106. /*
  2107. Take a copy of the query string, if there is any. The string will
  2108. be free'ed when the row is destroyed. If there is no query string,
  2109. we don't do anything special.
  2110. */
  2111. if (query.str)
  2112. {
  2113. char *str;
  2114. if (!(str= my_strndup(query.str, query.length, MYF(MY_WME))))
  2115. goto err;
  2116. query.str= str;
  2117. }
  2118. row= new delayed_row(query, duplic, ignore, log_on);
  2119. if (row == NULL)
  2120. {
  2121. my_free(query.str);
  2122. goto err;
  2123. }
  2124. if (!(row->record= (char*) my_malloc(table->s->reclength, MYF(MY_WME))))
  2125. goto err;
  2126. memcpy(row->record, table->record[0], table->s->reclength);
  2127. row->start_time= thd->start_time;
  2128. row->query_start_used= thd->query_start_used;
  2129. /*
  2130. those are for the binlog: LAST_INSERT_ID() has been evaluated at this
  2131. time, so record does not need it, but statement-based binlogging of the
  2132. INSERT will need when the row is actually inserted.
  2133. As for SET INSERT_ID, DELAYED does not honour it (BUG#20830).
  2134. */
  2135. row->stmt_depends_on_first_successful_insert_id_in_prev_stmt=
  2136. thd->stmt_depends_on_first_successful_insert_id_in_prev_stmt;
  2137. row->first_successful_insert_id_in_prev_stmt=
  2138. thd->first_successful_insert_id_in_prev_stmt;
  2139. row->timestamp_field_type= table->timestamp_field_type;
  2140. /* Add session variable timezone
  2141. Time_zone object will not be freed even the thread is ended.
  2142. So we can get time_zone object from thread which handling delayed statement.
  2143. See the comment of my_tz_find() for detail.
  2144. */
  2145. if (thd->time_zone_used)
  2146. {
  2147. row->time_zone = thd->variables.time_zone;
  2148. }
  2149. else
  2150. {
  2151. row->time_zone = NULL;
  2152. }
  2153. /* Copy session variables. */
  2154. row->auto_increment_increment= thd->variables.auto_increment_increment;
  2155. row->auto_increment_offset= thd->variables.auto_increment_offset;
  2156. row->sql_mode= thd->variables.sql_mode;
  2157. row->auto_increment_field_not_null= table->auto_increment_field_not_null;
  2158. /* Copy the next forced auto increment value, if any. */
  2159. if ((forced_auto_inc= thd->auto_inc_intervals_forced.get_next()))
  2160. {
  2161. row->forced_insert_id= forced_auto_inc->minimum();
  2162. DBUG_PRINT("delayed", ("transmitting auto_inc: %lu",
  2163. (ulong) row->forced_insert_id));
  2164. }
  2165. di->rows.push_back(row);
  2166. di->stacked_inserts++;
  2167. di->status=1;
  2168. if (table->s->blob_fields)
  2169. unlink_blobs(table);
  2170. mysql_cond_signal(&di->cond);
  2171. thread_safe_increment(delayed_rows_in_use,&LOCK_delayed_status);
  2172. mysql_mutex_unlock(&di->mutex);
  2173. DBUG_RETURN(0);
  2174. err:
  2175. delete row;
  2176. mysql_mutex_unlock(&di->mutex);
  2177. DBUG_RETURN(1);
  2178. }
  2179. /**
  2180. Signal the delayed insert thread that this user connection
  2181. is finished using it for this statement.
  2182. */
  2183. static void end_delayed_insert(THD *thd)
  2184. {
  2185. DBUG_ENTER("end_delayed_insert");
  2186. Delayed_insert *di=thd->di;
  2187. mysql_mutex_lock(&di->mutex);
  2188. DBUG_PRINT("info",("tables in use: %d",di->tables_in_use));
  2189. if (!--di->tables_in_use || di->thd.killed)
  2190. { // Unlock table
  2191. di->status=1;
  2192. mysql_cond_signal(&di->cond);
  2193. }
  2194. mysql_mutex_unlock(&di->mutex);
  2195. DBUG_VOID_RETURN;
  2196. }
  2197. /* We kill all delayed threads when doing flush-tables */
  2198. void kill_delayed_threads(void)
  2199. {
  2200. mysql_mutex_lock(&LOCK_delayed_insert); // For unlink from list
  2201. I_List_iterator<Delayed_insert> it(delayed_threads);
  2202. Delayed_insert *di;
  2203. while ((di= it++))
  2204. {
  2205. di->thd.killed= THD::KILL_CONNECTION;
  2206. if (di->thd.mysys_var)
  2207. {
  2208. mysql_mutex_lock(&di->thd.mysys_var->mutex);
  2209. if (di->thd.mysys_var->current_cond)
  2210. {
  2211. /*
  2212. We need the following test because the main mutex may be locked
  2213. in handle_delayed_insert()
  2214. */
  2215. if (&di->mutex != di->thd.mysys_var->current_mutex)
  2216. mysql_mutex_lock(di->thd.mysys_var->current_mutex);
  2217. mysql_cond_broadcast(di->thd.mysys_var->current_cond);
  2218. if (&di->mutex != di->thd.mysys_var->current_mutex)
  2219. mysql_mutex_unlock(di->thd.mysys_var->current_mutex);
  2220. }
  2221. mysql_mutex_unlock(&di->thd.mysys_var->mutex);
  2222. }
  2223. }
  2224. mysql_mutex_unlock(&LOCK_delayed_insert); // For unlink from list
  2225. }
  2226. /**
  2227. A strategy for the prelocking algorithm which prevents the
  2228. delayed insert thread from opening tables with engines which
  2229. do not support delayed inserts.
  2230. Particularly it allows to abort open_tables() as soon as we
  2231. discover that we have opened a MERGE table, without acquiring
  2232. metadata locks on underlying tables.
  2233. */
  2234. class Delayed_prelocking_strategy : public Prelocking_strategy
  2235. {
  2236. public:
  2237. virtual bool handle_routine(THD *thd, Query_tables_list *prelocking_ctx,
  2238. Sroutine_hash_entry *rt, sp_head *sp,
  2239. bool *need_prelocking);
  2240. virtual bool handle_table(THD *thd, Query_tables_list *prelocking_ctx,
  2241. TABLE_LIST *table_list, bool *need_prelocking);
  2242. virtual bool handle_view(THD *thd, Query_tables_list *prelocking_ctx,
  2243. TABLE_LIST *table_list, bool *need_prelocking);
  2244. };
  2245. bool Delayed_prelocking_strategy::
  2246. handle_table(THD *thd, Query_tables_list *prelocking_ctx,
  2247. TABLE_LIST *table_list, bool *need_prelocking)
  2248. {
  2249. DBUG_ASSERT(table_list->lock_type == TL_WRITE_DELAYED);
  2250. if (!(table_list->table->file->ha_table_flags() & HA_CAN_INSERT_DELAYED))
  2251. {
  2252. my_error(ER_DELAYED_NOT_SUPPORTED, MYF(0), table_list->table_name);
  2253. return TRUE;
  2254. }
  2255. return FALSE;
  2256. }
  2257. bool Delayed_prelocking_strategy::
  2258. handle_routine(THD *thd, Query_tables_list *prelocking_ctx,
  2259. Sroutine_hash_entry *rt, sp_head *sp,
  2260. bool *need_prelocking)
  2261. {
  2262. /* LEX used by the delayed insert thread has no routines. */
  2263. DBUG_ASSERT(0);
  2264. return FALSE;
  2265. }
  2266. bool Delayed_prelocking_strategy::
  2267. handle_view(THD *thd, Query_tables_list *prelocking_ctx,
  2268. TABLE_LIST *table_list, bool *need_prelocking)
  2269. {
  2270. /* We don't open views in the delayed insert thread. */
  2271. DBUG_ASSERT(0);
  2272. return FALSE;
  2273. }
  2274. /**
  2275. Open and lock table for use by delayed thread and check that
  2276. this table is suitable for delayed inserts.
  2277. @retval FALSE - Success.
  2278. @retval TRUE - Failure.
  2279. */
  2280. bool Delayed_insert::open_and_lock_table()
  2281. {
  2282. Delayed_prelocking_strategy prelocking_strategy;
  2283. /*
  2284. Use special prelocking strategy to get ER_DELAYED_NOT_SUPPORTED
  2285. error for tables with engines which don't support delayed inserts.
  2286. */
  2287. if (!(table= open_n_lock_single_table(&thd, &table_list,
  2288. TL_WRITE_DELAYED,
  2289. MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK,
  2290. &prelocking_strategy)))
  2291. {
  2292. thd.fatal_error(); // Abort waiting inserts
  2293. return TRUE;
  2294. }
  2295. if (table->triggers)
  2296. {
  2297. /*
  2298. Table has triggers. This is not an error, but we do
  2299. not support triggers with delayed insert. Terminate the delayed
  2300. thread without an error and thus request lock upgrade.
  2301. */
  2302. return TRUE;
  2303. }
  2304. table->copy_blobs= 1;
  2305. return FALSE;
  2306. }
  2307. /*
  2308. * Create a new delayed insert thread
  2309. */
  2310. pthread_handler_t handle_delayed_insert(void *arg)
  2311. {
  2312. Delayed_insert *di=(Delayed_insert*) arg;
  2313. THD *thd= &di->thd;
  2314. pthread_detach_this_thread();
  2315. /* Add thread to THD list so that's it's visible in 'show processlist' */
  2316. mysql_mutex_lock(&LOCK_thread_count);
  2317. thd->thread_id= thd->variables.pseudo_thread_id= thread_id++;
  2318. thd->set_current_time();
  2319. threads.append(thd);
  2320. thd->killed=abort_loop ? THD::KILL_CONNECTION : THD::NOT_KILLED;
  2321. mysql_mutex_unlock(&LOCK_thread_count);
  2322. mysql_thread_set_psi_id(thd->thread_id);
  2323. /*
  2324. Wait until the client runs into mysql_cond_wait(),
  2325. where we free it after the table is opened and di linked in the list.
  2326. If we did not wait here, the client might detect the opened table
  2327. before it is linked to the list. It would release LOCK_delayed_create
  2328. and allow another thread to create another handler for the same table,
  2329. since it does not find one in the list.
  2330. */
  2331. mysql_mutex_lock(&di->mutex);
  2332. if (my_thread_init())
  2333. {
  2334. /* Can't use my_error since store_globals has not yet been called */
  2335. thd->stmt_da->set_error_status(thd, ER_OUT_OF_RESOURCES,
  2336. ER(ER_OUT_OF_RESOURCES), NULL);
  2337. di->handler_thread_initialized= TRUE;
  2338. }
  2339. else
  2340. {
  2341. DBUG_ENTER("handle_delayed_insert");
  2342. thd->thread_stack= (char*) &thd;
  2343. if (init_thr_lock() || thd->store_globals())
  2344. {
  2345. /* Can't use my_error since store_globals has perhaps failed */
  2346. thd->stmt_da->set_error_status(thd, ER_OUT_OF_RESOURCES,
  2347. ER(ER_OUT_OF_RESOURCES), NULL);
  2348. di->handler_thread_initialized= TRUE;
  2349. thd->fatal_error();
  2350. goto err;
  2351. }
  2352. thd->lex->sql_command= SQLCOM_INSERT; // For innodb::store_lock()
  2353. /*
  2354. Statement-based replication of INSERT DELAYED has problems with RAND()
  2355. and user vars, so in mixed mode we go to row-based.
  2356. */
  2357. thd->lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_INSERT_DELAYED);
  2358. thd->set_current_stmt_binlog_format_row_if_mixed();
  2359. /*
  2360. Clone the ticket representing the lock on the target table for
  2361. the insert and add it to the list of granted metadata locks held by
  2362. the handler thread. This is safe since the handler thread is
  2363. not holding nor waiting on any metadata locks.
  2364. */
  2365. if (thd->mdl_context.clone_ticket(&di->table_list.mdl_request))
  2366. {
  2367. di->handler_thread_initialized= TRUE;
  2368. goto err;
  2369. }
  2370. /*
  2371. Now that the ticket has been cloned, it is safe for the connection
  2372. thread to exit.
  2373. */
  2374. di->handler_thread_initialized= TRUE;
  2375. di->table_list.mdl_request.ticket= NULL;
  2376. if (di->open_and_lock_table())
  2377. goto err;
  2378. /* Tell client that the thread is initialized */
  2379. mysql_cond_signal(&di->cond_client);
  2380. /* Now wait until we get an insert or lock to handle */
  2381. /* We will not abort as long as a client thread uses this thread */
  2382. for (;;)
  2383. {
  2384. if (thd->killed)
  2385. {
  2386. uint lock_count;
  2387. /*
  2388. Remove this from delay insert list so that no one can request a
  2389. table from this
  2390. */
  2391. mysql_mutex_unlock(&di->mutex);
  2392. mysql_mutex_lock(&LOCK_delayed_insert);
  2393. di->unlink();
  2394. lock_count=di->lock_count();
  2395. mysql_mutex_unlock(&LOCK_delayed_insert);
  2396. mysql_mutex_lock(&di->mutex);
  2397. if (!lock_count && !di->tables_in_use && !di->stacked_inserts)
  2398. break; // Time to die
  2399. }
  2400. /* Shouldn't wait if killed or an insert is waiting. */
  2401. if (!thd->killed && !di->status && !di->stacked_inserts)
  2402. {
  2403. struct timespec abstime;
  2404. set_timespec(abstime, delayed_insert_timeout);
  2405. /* Information for pthread_kill */
  2406. di->thd.mysys_var->current_mutex= &di->mutex;
  2407. di->thd.mysys_var->current_cond= &di->cond;
  2408. thd_proc_info(&(di->thd), "Waiting for INSERT");
  2409. DBUG_PRINT("info",("Waiting for someone to insert rows"));
  2410. while (!thd->killed && !di->status)
  2411. {
  2412. int error;
  2413. mysql_audit_release(thd);
  2414. #if defined(HAVE_BROKEN_COND_TIMEDWAIT)
  2415. error= mysql_cond_wait(&di->cond, &di->mutex);
  2416. #else
  2417. error= mysql_cond_timedwait(&di->cond, &di->mutex, &abstime);
  2418. #ifdef EXTRA_DEBUG
  2419. if (error && error != EINTR && error != ETIMEDOUT)
  2420. {
  2421. fprintf(stderr, "Got error %d from mysql_cond_timedwait\n", error);
  2422. DBUG_PRINT("error", ("Got error %d from mysql_cond_timedwait",
  2423. error));
  2424. }
  2425. #endif
  2426. #endif
  2427. if (error == ETIMEDOUT || error == ETIME)
  2428. thd->killed= THD::KILL_CONNECTION;
  2429. }
  2430. /* We can't lock di->mutex and mysys_var->mutex at the same time */
  2431. mysql_mutex_unlock(&di->mutex);
  2432. mysql_mutex_lock(&di->thd.mysys_var->mutex);
  2433. di->thd.mysys_var->current_mutex= 0;
  2434. di->thd.mysys_var->current_cond= 0;
  2435. mysql_mutex_unlock(&di->thd.mysys_var->mutex);
  2436. mysql_mutex_lock(&di->mutex);
  2437. }
  2438. thd_proc_info(&(di->thd), 0);
  2439. if (di->tables_in_use && ! thd->lock && !thd->killed)
  2440. {
  2441. /*
  2442. Request for new delayed insert.
  2443. Lock the table, but avoid to be blocked by a global read lock.
  2444. If we got here while a global read lock exists, then one or more
  2445. inserts started before the lock was requested. These are allowed
  2446. to complete their work before the server returns control to the
  2447. client which requested the global read lock. The delayed insert
  2448. handler will close the table and finish when the outstanding
  2449. inserts are done.
  2450. */
  2451. if (! (thd->lock= mysql_lock_tables(thd, &di->table, 1, 0)))
  2452. {
  2453. /* Fatal error */
  2454. thd->killed= THD::KILL_CONNECTION;
  2455. }
  2456. mysql_cond_broadcast(&di->cond_client);
  2457. }
  2458. if (di->stacked_inserts)
  2459. {
  2460. if (di->handle_inserts())
  2461. {
  2462. /* Some fatal error */
  2463. thd->killed= THD::KILL_CONNECTION;
  2464. }
  2465. }
  2466. di->status=0;
  2467. if (!di->stacked_inserts && !di->tables_in_use && thd->lock)
  2468. {
  2469. /*
  2470. No one is doing a insert delayed
  2471. Unlock table so that other threads can use it
  2472. */
  2473. MYSQL_LOCK *lock=thd->lock;
  2474. thd->lock=0;
  2475. mysql_mutex_unlock(&di->mutex);
  2476. /*
  2477. We need to release next_insert_id before unlocking. This is
  2478. enforced by handler::ha_external_lock().
  2479. */
  2480. di->table->file->ha_release_auto_increment();
  2481. mysql_unlock_tables(thd, lock);
  2482. trans_commit_stmt(thd);
  2483. di->group_count=0;
  2484. mysql_audit_release(thd);
  2485. mysql_mutex_lock(&di->mutex);
  2486. }
  2487. if (di->tables_in_use)
  2488. mysql_cond_broadcast(&di->cond_client); // If waiting clients
  2489. }
  2490. err:
  2491. DBUG_LEAVE;
  2492. }
  2493. close_thread_tables(thd); // Free the table
  2494. thd->mdl_context.release_transactional_locks();
  2495. di->table=0;
  2496. thd->killed= THD::KILL_CONNECTION; // If error
  2497. mysql_cond_broadcast(&di->cond_client); // Safety
  2498. mysql_mutex_unlock(&di->mutex);
  2499. mysql_mutex_lock(&LOCK_delayed_create); // Because of delayed_get_table
  2500. mysql_mutex_lock(&LOCK_delayed_insert);
  2501. /*
  2502. di should be unlinked from the thread handler list and have no active
  2503. clients
  2504. */
  2505. delete di;
  2506. mysql_mutex_unlock(&LOCK_delayed_insert);
  2507. mysql_mutex_unlock(&LOCK_delayed_create);
  2508. my_thread_end();
  2509. pthread_exit(0);
  2510. return 0;
  2511. }
  2512. /* Remove pointers from temporary fields to allocated values */
  2513. static void unlink_blobs(register TABLE *table)
  2514. {
  2515. for (Field **ptr=table->field ; *ptr ; ptr++)
  2516. {
  2517. if ((*ptr)->flags & BLOB_FLAG)
  2518. ((Field_blob *) (*ptr))->clear_temporary();
  2519. }
  2520. }
  2521. /* Free blobs stored in current row */
  2522. static void free_delayed_insert_blobs(register TABLE *table)
  2523. {
  2524. for (Field **ptr=table->field ; *ptr ; ptr++)
  2525. {
  2526. if ((*ptr)->flags & BLOB_FLAG)
  2527. {
  2528. uchar *str;
  2529. ((Field_blob *) (*ptr))->get_ptr(&str);
  2530. my_free(str);
  2531. ((Field_blob *) (*ptr))->reset();
  2532. }
  2533. }
  2534. }
  2535. bool Delayed_insert::handle_inserts(void)
  2536. {
  2537. int error;
  2538. ulong max_rows;
  2539. bool has_trans = TRUE;
  2540. bool using_ignore= 0, using_opt_replace= 0,
  2541. using_bin_log= mysql_bin_log.is_open();
  2542. delayed_row *row;
  2543. DBUG_ENTER("handle_inserts");
  2544. /* Allow client to insert new rows */
  2545. mysql_mutex_unlock(&mutex);
  2546. table->next_number_field=table->found_next_number_field;
  2547. table->use_all_columns();
  2548. thd_proc_info(&thd, "upgrading lock");
  2549. if (thr_upgrade_write_delay_lock(*thd.lock->locks, delayed_lock,
  2550. thd.variables.lock_wait_timeout))
  2551. {
  2552. /*
  2553. This can happen if thread is killed either by a shutdown
  2554. or if another thread is removing the current table definition
  2555. from the table cache.
  2556. */
  2557. my_error(ER_DELAYED_CANT_CHANGE_LOCK,MYF(ME_FATALERROR),
  2558. table->s->table_name.str);
  2559. goto err;
  2560. }
  2561. thd_proc_info(&thd, "insert");
  2562. max_rows= delayed_insert_limit;
  2563. if (thd.killed || table->s->has_old_version())
  2564. {
  2565. thd.killed= THD::KILL_CONNECTION;
  2566. max_rows= ULONG_MAX; // Do as much as possible
  2567. }
  2568. /*
  2569. We can't use row caching when using the binary log because if
  2570. we get a crash, then binary log will contain rows that are not yet
  2571. written to disk, which will cause problems in replication.
  2572. */
  2573. if (!using_bin_log)
  2574. table->file->extra(HA_EXTRA_WRITE_CACHE);
  2575. mysql_mutex_lock(&mutex);
  2576. while ((row=rows.get()))
  2577. {
  2578. stacked_inserts--;
  2579. mysql_mutex_unlock(&mutex);
  2580. memcpy(table->record[0],row->record,table->s->reclength);
  2581. thd.start_time=row->start_time;
  2582. thd.query_start_used=row->query_start_used;
  2583. /*
  2584. To get the exact auto_inc interval to store in the binlog we must not
  2585. use values from the previous interval (of the previous rows).
  2586. */
  2587. bool log_query= (row->log_query && row->query.str != NULL);
  2588. DBUG_PRINT("delayed", ("query: '%s' length: %lu", row->query.str ?
  2589. row->query.str : "[NULL]",
  2590. (ulong) row->query.length));
  2591. if (log_query)
  2592. {
  2593. /*
  2594. This is the first value of an INSERT statement.
  2595. It is the right place to clear a forced insert_id.
  2596. This is usually done after the last value of an INSERT statement,
  2597. but we won't know this in the insert delayed thread. But before
  2598. the first value is sufficiently equivalent to after the last
  2599. value of the previous statement.
  2600. */
  2601. table->file->ha_release_auto_increment();
  2602. thd.auto_inc_intervals_in_cur_stmt_for_binlog.empty();
  2603. }
  2604. thd.first_successful_insert_id_in_prev_stmt=
  2605. row->first_successful_insert_id_in_prev_stmt;
  2606. thd.stmt_depends_on_first_successful_insert_id_in_prev_stmt=
  2607. row->stmt_depends_on_first_successful_insert_id_in_prev_stmt;
  2608. table->timestamp_field_type= row->timestamp_field_type;
  2609. table->auto_increment_field_not_null= row->auto_increment_field_not_null;
  2610. /* Copy the session variables. */
  2611. thd.variables.auto_increment_increment= row->auto_increment_increment;
  2612. thd.variables.auto_increment_offset= row->auto_increment_offset;
  2613. thd.variables.sql_mode= row->sql_mode;
  2614. /* Copy a forced insert_id, if any. */
  2615. if (row->forced_insert_id)
  2616. {
  2617. DBUG_PRINT("delayed", ("received auto_inc: %lu",
  2618. (ulong) row->forced_insert_id));
  2619. thd.force_one_auto_inc_interval(row->forced_insert_id);
  2620. }
  2621. info.ignore= row->ignore;
  2622. info.handle_duplicates= row->dup;
  2623. if (info.ignore ||
  2624. info.handle_duplicates != DUP_ERROR)
  2625. {
  2626. table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  2627. using_ignore=1;
  2628. }
  2629. if (info.handle_duplicates == DUP_REPLACE &&
  2630. (!table->triggers ||
  2631. !table->triggers->has_delete_triggers()))
  2632. {
  2633. table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
  2634. using_opt_replace= 1;
  2635. }
  2636. if (info.handle_duplicates == DUP_UPDATE)
  2637. table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
  2638. thd.clear_error(); // reset error for binlog
  2639. if (write_record(&thd, table, &info))
  2640. {
  2641. info.error_count++; // Ignore errors
  2642. thread_safe_increment(delayed_insert_errors,&LOCK_delayed_status);
  2643. row->log_query = 0;
  2644. }
  2645. if (using_ignore)
  2646. {
  2647. using_ignore=0;
  2648. table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
  2649. }
  2650. if (using_opt_replace)
  2651. {
  2652. using_opt_replace= 0;
  2653. table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
  2654. }
  2655. if (log_query && mysql_bin_log.is_open())
  2656. {
  2657. bool backup_time_zone_used = thd.time_zone_used;
  2658. Time_zone *backup_time_zone = thd.variables.time_zone;
  2659. if (row->time_zone != NULL)
  2660. {
  2661. thd.time_zone_used = true;
  2662. thd.variables.time_zone = row->time_zone;
  2663. }
  2664. /* if the delayed insert was killed, the killed status is
  2665. ignored while binlogging */
  2666. int errcode= 0;
  2667. if (thd.killed == THD::NOT_KILLED)
  2668. errcode= query_error_code(&thd, TRUE);
  2669. /*
  2670. If the query has several rows to insert, only the first row will come
  2671. here. In row-based binlogging, this means that the first row will be
  2672. written to binlog as one Table_map event and one Rows event (due to an
  2673. event flush done in binlog_query()), then all other rows of this query
  2674. will be binlogged together as one single Table_map event and one
  2675. single Rows event.
  2676. */
  2677. if (thd.binlog_query(THD::ROW_QUERY_TYPE,
  2678. row->query.str, row->query.length,
  2679. FALSE, FALSE, FALSE, errcode))
  2680. goto err;
  2681. thd.time_zone_used = backup_time_zone_used;
  2682. thd.variables.time_zone = backup_time_zone;
  2683. }
  2684. if (table->s->blob_fields)
  2685. free_delayed_insert_blobs(table);
  2686. thread_safe_decrement(delayed_rows_in_use,&LOCK_delayed_status);
  2687. thread_safe_increment(delayed_insert_writes,&LOCK_delayed_status);
  2688. mysql_mutex_lock(&mutex);
  2689. /*
  2690. Reset the table->auto_increment_field_not_null as it is valid for
  2691. only one row.
  2692. */
  2693. table->auto_increment_field_not_null= FALSE;
  2694. delete row;
  2695. /*
  2696. Let READ clients do something once in a while
  2697. We should however not break in the middle of a multi-line insert
  2698. if we have binary logging enabled as we don't want other commands
  2699. on this table until all entries has been processed
  2700. */
  2701. if (group_count++ >= max_rows && (row= rows.head()) &&
  2702. (!(row->log_query & using_bin_log)))
  2703. {
  2704. group_count=0;
  2705. if (stacked_inserts || tables_in_use) // Let these wait a while
  2706. {
  2707. if (tables_in_use)
  2708. mysql_cond_broadcast(&cond_client); // If waiting clients
  2709. thd_proc_info(&thd, "reschedule");
  2710. mysql_mutex_unlock(&mutex);
  2711. if ((error=table->file->extra(HA_EXTRA_NO_CACHE)))
  2712. {
  2713. /* This should never happen */
  2714. table->file->print_error(error,MYF(0));
  2715. sql_print_error("%s", thd.stmt_da->message());
  2716. DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed in loop"));
  2717. goto err;
  2718. }
  2719. query_cache_invalidate3(&thd, table, 1);
  2720. if (thr_reschedule_write_lock(*thd.lock->locks,
  2721. thd.variables.lock_wait_timeout))
  2722. {
  2723. /* This is not known to happen. */
  2724. my_error(ER_DELAYED_CANT_CHANGE_LOCK,MYF(ME_FATALERROR),
  2725. table->s->table_name.str);
  2726. goto err;
  2727. }
  2728. if (!using_bin_log)
  2729. table->file->extra(HA_EXTRA_WRITE_CACHE);
  2730. mysql_mutex_lock(&mutex);
  2731. thd_proc_info(&thd, "insert");
  2732. }
  2733. if (tables_in_use)
  2734. mysql_cond_broadcast(&cond_client); // If waiting clients
  2735. }
  2736. }
  2737. thd_proc_info(&thd, 0);
  2738. mysql_mutex_unlock(&mutex);
  2739. /*
  2740. We need to flush the pending event when using row-based
  2741. replication since the flushing normally done in binlog_query() is
  2742. not done last in the statement: for delayed inserts, the insert
  2743. statement is logged *before* all rows are inserted.
  2744. We can flush the pending event without checking the thd->lock
  2745. since the delayed insert *thread* is not inside a stored function
  2746. or trigger.
  2747. TODO: Move the logging to last in the sequence of rows.
  2748. */
  2749. has_trans= thd.lex->sql_command == SQLCOM_CREATE_TABLE ||
  2750. table->file->has_transactions();
  2751. if (thd.is_current_stmt_binlog_format_row() &&
  2752. thd.binlog_flush_pending_rows_event(TRUE, has_trans))
  2753. goto err;
  2754. if ((error=table->file->extra(HA_EXTRA_NO_CACHE)))
  2755. { // This shouldn't happen
  2756. table->file->print_error(error,MYF(0));
  2757. sql_print_error("%s", thd.stmt_da->message());
  2758. DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed after loop"));
  2759. goto err;
  2760. }
  2761. query_cache_invalidate3(&thd, table, 1);
  2762. mysql_mutex_lock(&mutex);
  2763. DBUG_RETURN(0);
  2764. err:
  2765. #ifndef DBUG_OFF
  2766. max_rows= 0; // For DBUG output
  2767. #endif
  2768. /* Remove all not used rows */
  2769. while ((row=rows.get()))
  2770. {
  2771. if (table->s->blob_fields)
  2772. {
  2773. memcpy(table->record[0],row->record,table->s->reclength);
  2774. free_delayed_insert_blobs(table);
  2775. }
  2776. delete row;
  2777. thread_safe_increment(delayed_insert_errors,&LOCK_delayed_status);
  2778. stacked_inserts--;
  2779. #ifndef DBUG_OFF
  2780. max_rows++;
  2781. #endif
  2782. }
  2783. DBUG_PRINT("error", ("dropped %lu rows after an error", max_rows));
  2784. thread_safe_increment(delayed_insert_errors, &LOCK_delayed_status);
  2785. mysql_mutex_lock(&mutex);
  2786. DBUG_RETURN(1);
  2787. }
  2788. #endif /* EMBEDDED_LIBRARY */
  2789. /***************************************************************************
  2790. Store records in INSERT ... SELECT *
  2791. ***************************************************************************/
  2792. /*
  2793. make insert specific preparation and checks after opening tables
  2794. SYNOPSIS
  2795. mysql_insert_select_prepare()
  2796. thd thread handler
  2797. RETURN
  2798. FALSE OK
  2799. TRUE Error
  2800. */
  2801. bool mysql_insert_select_prepare(THD *thd)
  2802. {
  2803. LEX *lex= thd->lex;
  2804. SELECT_LEX *select_lex= &lex->select_lex;
  2805. TABLE_LIST *first_select_leaf_table;
  2806. DBUG_ENTER("mysql_insert_select_prepare");
  2807. /*
  2808. SELECT_LEX do not belong to INSERT statement, so we can't add WHERE
  2809. clause if table is VIEW
  2810. */
  2811. if (mysql_prepare_insert(thd, lex->query_tables,
  2812. lex->query_tables->table, lex->field_list, 0,
  2813. lex->update_list, lex->value_list,
  2814. lex->duplicates,
  2815. &select_lex->where, TRUE, FALSE, FALSE))
  2816. DBUG_RETURN(TRUE);
  2817. /*
  2818. exclude first table from leaf tables list, because it belong to
  2819. INSERT
  2820. */
  2821. DBUG_ASSERT(select_lex->leaf_tables != 0);
  2822. lex->leaf_tables_insert= select_lex->leaf_tables;
  2823. /* skip all leaf tables belonged to view where we are insert */
  2824. for (first_select_leaf_table= select_lex->leaf_tables->next_leaf;
  2825. first_select_leaf_table &&
  2826. first_select_leaf_table->belong_to_view &&
  2827. first_select_leaf_table->belong_to_view ==
  2828. lex->leaf_tables_insert->belong_to_view;
  2829. first_select_leaf_table= first_select_leaf_table->next_leaf)
  2830. {}
  2831. select_lex->leaf_tables= first_select_leaf_table;
  2832. DBUG_RETURN(FALSE);
  2833. }
  2834. select_insert::select_insert(TABLE_LIST *table_list_par, TABLE *table_par,
  2835. List<Item> *fields_par,
  2836. List<Item> *update_fields,
  2837. List<Item> *update_values,
  2838. enum_duplicates duplic,
  2839. bool ignore_check_option_errors)
  2840. :table_list(table_list_par), table(table_par), fields(fields_par),
  2841. autoinc_value_of_last_inserted_row(0),
  2842. insert_into_view(table_list_par && table_list_par->view != 0)
  2843. {
  2844. bzero((char*) &info,sizeof(info));
  2845. info.handle_duplicates= duplic;
  2846. info.ignore= ignore_check_option_errors;
  2847. info.update_fields= update_fields;
  2848. info.update_values= update_values;
  2849. if (table_list_par)
  2850. info.view= (table_list_par->view ? table_list_par : 0);
  2851. }
  2852. int
  2853. select_insert::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
  2854. {
  2855. LEX *lex= thd->lex;
  2856. int res;
  2857. table_map map= 0;
  2858. SELECT_LEX *lex_current_select_save= lex->current_select;
  2859. DBUG_ENTER("select_insert::prepare");
  2860. unit= u;
  2861. /*
  2862. Since table in which we are going to insert is added to the first
  2863. select, LEX::current_select should point to the first select while
  2864. we are fixing fields from insert list.
  2865. */
  2866. lex->current_select= &lex->select_lex;
  2867. /* Errors during check_insert_fields() should not be ignored. */
  2868. lex->current_select->no_error= FALSE;
  2869. res= (setup_fields(thd, 0, values, MARK_COLUMNS_READ, 0, 0) ||
  2870. check_insert_fields(thd, table_list, *fields, values,
  2871. !insert_into_view, 1, &map));
  2872. if (!res && fields->elements)
  2873. {
  2874. bool saved_abort_on_warning= thd->abort_on_warning;
  2875. thd->abort_on_warning= !info.ignore && (thd->variables.sql_mode &
  2876. (MODE_STRICT_TRANS_TABLES |
  2877. MODE_STRICT_ALL_TABLES));
  2878. res= check_that_all_fields_are_given_values(thd, table_list->table,
  2879. table_list);
  2880. thd->abort_on_warning= saved_abort_on_warning;
  2881. }
  2882. if (info.handle_duplicates == DUP_UPDATE && !res)
  2883. {
  2884. Name_resolution_context *context= &lex->select_lex.context;
  2885. Name_resolution_context_state ctx_state;
  2886. /* Save the state of the current name resolution context. */
  2887. ctx_state.save_state(context, table_list);
  2888. /* Perform name resolution only in the first table - 'table_list'. */
  2889. table_list->next_local= 0;
  2890. context->resolve_in_table_list_only(table_list);
  2891. lex->select_lex.no_wrap_view_item= TRUE;
  2892. res= res || check_update_fields(thd, context->table_list,
  2893. *info.update_fields, *info.update_values,
  2894. &map);
  2895. lex->select_lex.no_wrap_view_item= FALSE;
  2896. /*
  2897. When we are not using GROUP BY and there are no ungrouped aggregate functions
  2898. we can refer to other tables in the ON DUPLICATE KEY part.
  2899. We use next_name_resolution_table descructively, so check it first (views?)
  2900. */
  2901. DBUG_ASSERT (!table_list->next_name_resolution_table);
  2902. if (lex->select_lex.group_list.elements == 0 &&
  2903. !lex->select_lex.with_sum_func)
  2904. /*
  2905. We must make a single context out of the two separate name resolution contexts :
  2906. the INSERT table and the tables in the SELECT part of INSERT ... SELECT.
  2907. To do that we must concatenate the two lists
  2908. */
  2909. table_list->next_name_resolution_table=
  2910. ctx_state.get_first_name_resolution_table();
  2911. res= res || setup_fields(thd, 0, *info.update_values,
  2912. MARK_COLUMNS_READ, 0, 0);
  2913. if (!res)
  2914. {
  2915. /*
  2916. Traverse the update values list and substitute fields from the
  2917. select for references (Item_ref objects) to them. This is done in
  2918. order to get correct values from those fields when the select
  2919. employs a temporary table.
  2920. */
  2921. List_iterator<Item> li(*info.update_values);
  2922. Item *item;
  2923. while ((item= li++))
  2924. {
  2925. item->transform(&Item::update_value_transformer,
  2926. (uchar*)lex->current_select);
  2927. }
  2928. }
  2929. /* Restore the current context. */
  2930. ctx_state.restore_state(context, table_list);
  2931. }
  2932. lex->current_select= lex_current_select_save;
  2933. if (res)
  2934. DBUG_RETURN(1);
  2935. /*
  2936. if it is INSERT into join view then check_insert_fields already found
  2937. real table for insert
  2938. */
  2939. table= table_list->table;
  2940. /*
  2941. Is table which we are changing used somewhere in other parts of
  2942. query
  2943. */
  2944. if (unique_table(thd, table_list, table_list->next_global, 0))
  2945. {
  2946. /* Using same table for INSERT and SELECT */
  2947. lex->current_select->options|= OPTION_BUFFER_RESULT;
  2948. lex->current_select->join->select_options|= OPTION_BUFFER_RESULT;
  2949. }
  2950. else if (!(lex->current_select->options & OPTION_BUFFER_RESULT) &&
  2951. thd->locked_tables_mode <= LTM_LOCK_TABLES)
  2952. {
  2953. /*
  2954. We must not yet prepare the result table if it is the same as one of the
  2955. source tables (INSERT SELECT). The preparation may disable
  2956. indexes on the result table, which may be used during the select, if it
  2957. is the same table (Bug #6034). Do the preparation after the select phase
  2958. in select_insert::prepare2().
  2959. We won't start bulk inserts at all if this statement uses functions or
  2960. should invoke triggers since they may access to the same table too.
  2961. */
  2962. table->file->ha_start_bulk_insert((ha_rows) 0);
  2963. }
  2964. restore_record(table,s->default_values); // Get empty record
  2965. table->next_number_field=table->found_next_number_field;
  2966. #ifdef HAVE_REPLICATION
  2967. if (thd->slave_thread &&
  2968. (info.handle_duplicates == DUP_UPDATE) &&
  2969. (table->next_number_field != NULL) &&
  2970. rpl_master_has_bug(&active_mi->rli, 24432, TRUE, NULL, NULL))
  2971. DBUG_RETURN(1);
  2972. #endif
  2973. thd->cuted_fields=0;
  2974. if (info.ignore || info.handle_duplicates != DUP_ERROR)
  2975. table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  2976. if (info.handle_duplicates == DUP_REPLACE &&
  2977. (!table->triggers || !table->triggers->has_delete_triggers()))
  2978. table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
  2979. if (info.handle_duplicates == DUP_UPDATE)
  2980. table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
  2981. thd->abort_on_warning= (!info.ignore &&
  2982. (thd->variables.sql_mode &
  2983. (MODE_STRICT_TRANS_TABLES |
  2984. MODE_STRICT_ALL_TABLES)));
  2985. res= (table_list->prepare_where(thd, 0, TRUE) ||
  2986. table_list->prepare_check_option(thd));
  2987. if (!res)
  2988. prepare_triggers_for_insert_stmt(table);
  2989. DBUG_RETURN(res);
  2990. }
  2991. /*
  2992. Finish the preparation of the result table.
  2993. SYNOPSIS
  2994. select_insert::prepare2()
  2995. void
  2996. DESCRIPTION
  2997. If the result table is the same as one of the source tables (INSERT SELECT),
  2998. the result table is not finally prepared at the join prepair phase.
  2999. Do the final preparation now.
  3000. RETURN
  3001. 0 OK
  3002. */
  3003. int select_insert::prepare2(void)
  3004. {
  3005. DBUG_ENTER("select_insert::prepare2");
  3006. if (thd->lex->current_select->options & OPTION_BUFFER_RESULT &&
  3007. thd->locked_tables_mode <= LTM_LOCK_TABLES)
  3008. table->file->ha_start_bulk_insert((ha_rows) 0);
  3009. DBUG_RETURN(0);
  3010. }
  3011. void select_insert::cleanup()
  3012. {
  3013. /* select_insert/select_create are never re-used in prepared statement */
  3014. DBUG_ASSERT(0);
  3015. }
  3016. select_insert::~select_insert()
  3017. {
  3018. DBUG_ENTER("~select_insert");
  3019. if (table)
  3020. {
  3021. table->next_number_field=0;
  3022. table->auto_increment_field_not_null= FALSE;
  3023. table->file->ha_reset();
  3024. }
  3025. thd->count_cuted_fields= CHECK_FIELD_IGNORE;
  3026. thd->abort_on_warning= 0;
  3027. DBUG_VOID_RETURN;
  3028. }
  3029. bool select_insert::send_data(List<Item> &values)
  3030. {
  3031. DBUG_ENTER("select_insert::send_data");
  3032. bool error=0;
  3033. if (unit->offset_limit_cnt)
  3034. { // using limit offset,count
  3035. unit->offset_limit_cnt--;
  3036. DBUG_RETURN(0);
  3037. }
  3038. thd->count_cuted_fields= CHECK_FIELD_WARN; // Calculate cuted fields
  3039. store_values(values);
  3040. thd->count_cuted_fields= CHECK_FIELD_ERROR_FOR_NULL;
  3041. if (thd->is_error())
  3042. {
  3043. table->auto_increment_field_not_null= FALSE;
  3044. DBUG_RETURN(1);
  3045. }
  3046. if (table_list) // Not CREATE ... SELECT
  3047. {
  3048. switch (table_list->view_check_option(thd, info.ignore)) {
  3049. case VIEW_CHECK_SKIP:
  3050. DBUG_RETURN(0);
  3051. case VIEW_CHECK_ERROR:
  3052. DBUG_RETURN(1);
  3053. }
  3054. }
  3055. // Release latches in case bulk insert takes a long time
  3056. ha_release_temporary_latches(thd);
  3057. error= write_record(thd, table, &info);
  3058. table->auto_increment_field_not_null= FALSE;
  3059. if (!error)
  3060. {
  3061. if (table->triggers || info.handle_duplicates == DUP_UPDATE)
  3062. {
  3063. /*
  3064. Restore fields of the record since it is possible that they were
  3065. changed by ON DUPLICATE KEY UPDATE clause.
  3066. If triggers exist then whey can modify some fields which were not
  3067. originally touched by INSERT ... SELECT, so we have to restore
  3068. their original values for the next row.
  3069. */
  3070. restore_record(table, s->default_values);
  3071. }
  3072. if (table->next_number_field)
  3073. {
  3074. /*
  3075. If no value has been autogenerated so far, we need to remember the
  3076. value we just saw, we may need to send it to client in the end.
  3077. */
  3078. if (thd->first_successful_insert_id_in_cur_stmt == 0) // optimization
  3079. autoinc_value_of_last_inserted_row=
  3080. table->next_number_field->val_int();
  3081. /*
  3082. Clear auto-increment field for the next record, if triggers are used
  3083. we will clear it twice, but this should be cheap.
  3084. */
  3085. table->next_number_field->reset();
  3086. }
  3087. }
  3088. DBUG_RETURN(error);
  3089. }
  3090. void select_insert::store_values(List<Item> &values)
  3091. {
  3092. if (fields->elements)
  3093. fill_record_n_invoke_before_triggers(thd, *fields, values, 1,
  3094. table->triggers, TRG_EVENT_INSERT);
  3095. else
  3096. fill_record_n_invoke_before_triggers(thd, table->field, values, 1,
  3097. table->triggers, TRG_EVENT_INSERT);
  3098. }
  3099. void select_insert::send_error(uint errcode,const char *err)
  3100. {
  3101. DBUG_ENTER("select_insert::send_error");
  3102. my_message(errcode, err, MYF(0));
  3103. DBUG_VOID_RETURN;
  3104. }
  3105. bool select_insert::send_eof()
  3106. {
  3107. int error;
  3108. bool const trans_table= table->file->has_transactions();
  3109. ulonglong id, row_count;
  3110. bool changed;
  3111. THD::killed_state killed_status= thd->killed;
  3112. DBUG_ENTER("select_insert::send_eof");
  3113. DBUG_PRINT("enter", ("trans_table=%d, table_type='%s'",
  3114. trans_table, table->file->table_type()));
  3115. error= (thd->locked_tables_mode <= LTM_LOCK_TABLES ?
  3116. table->file->ha_end_bulk_insert() : 0);
  3117. table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
  3118. table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
  3119. changed= (info.copied || info.deleted || info.updated);
  3120. if (changed)
  3121. {
  3122. /*
  3123. We must invalidate the table in the query cache before binlog writing
  3124. and ha_autocommit_or_rollback.
  3125. */
  3126. query_cache_invalidate3(thd, table, 1);
  3127. }
  3128. if (thd->transaction.stmt.modified_non_trans_table)
  3129. thd->transaction.all.modified_non_trans_table= TRUE;
  3130. DBUG_ASSERT(trans_table || !changed ||
  3131. thd->transaction.stmt.modified_non_trans_table);
  3132. /*
  3133. Write to binlog before commiting transaction. No statement will
  3134. be written by the binlog_query() below in RBR mode. All the
  3135. events are in the transaction cache and will be written when
  3136. ha_autocommit_or_rollback() is issued below.
  3137. */
  3138. if (mysql_bin_log.is_open() &&
  3139. (!error || thd->transaction.stmt.modified_non_trans_table))
  3140. {
  3141. int errcode= 0;
  3142. if (!error)
  3143. thd->clear_error();
  3144. else
  3145. errcode= query_error_code(thd, killed_status == THD::NOT_KILLED);
  3146. if (thd->binlog_query(THD::ROW_QUERY_TYPE,
  3147. thd->query(), thd->query_length(),
  3148. trans_table, FALSE, FALSE, errcode))
  3149. {
  3150. table->file->ha_release_auto_increment();
  3151. DBUG_RETURN(1);
  3152. }
  3153. }
  3154. table->file->ha_release_auto_increment();
  3155. if (error)
  3156. {
  3157. table->file->print_error(error,MYF(0));
  3158. DBUG_RETURN(1);
  3159. }
  3160. char buff[160];
  3161. if (info.ignore)
  3162. sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
  3163. (ulong) (info.records - info.copied),
  3164. (ulong) thd->warning_info->statement_warn_count());
  3165. else
  3166. sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
  3167. (ulong) (info.deleted+info.updated),
  3168. (ulong) thd->warning_info->statement_warn_count());
  3169. row_count= info.copied + info.deleted +
  3170. ((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
  3171. info.touched : info.updated);
  3172. id= (thd->first_successful_insert_id_in_cur_stmt > 0) ?
  3173. thd->first_successful_insert_id_in_cur_stmt :
  3174. (thd->arg_of_last_insert_id_function ?
  3175. thd->first_successful_insert_id_in_prev_stmt :
  3176. (info.copied ? autoinc_value_of_last_inserted_row : 0));
  3177. ::my_ok(thd, row_count, id, buff);
  3178. DBUG_RETURN(0);
  3179. }
  3180. void select_insert::abort_result_set() {
  3181. DBUG_ENTER("select_insert::abort_result_set");
  3182. /*
  3183. If the creation of the table failed (due to a syntax error, for
  3184. example), no table will have been opened and therefore 'table'
  3185. will be NULL. In that case, we still need to execute the rollback
  3186. and the end of the function.
  3187. */
  3188. if (table)
  3189. {
  3190. bool changed, transactional_table;
  3191. /*
  3192. If we are not in prelocked mode, we end the bulk insert started
  3193. before.
  3194. */
  3195. if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
  3196. table->file->ha_end_bulk_insert();
  3197. /*
  3198. If at least one row has been inserted/modified and will stay in
  3199. the table (the table doesn't have transactions) we must write to
  3200. the binlog (and the error code will make the slave stop).
  3201. For many errors (example: we got a duplicate key error while
  3202. inserting into a MyISAM table), no row will be added to the table,
  3203. so passing the error to the slave will not help since there will
  3204. be an error code mismatch (the inserts will succeed on the slave
  3205. with no error).
  3206. If table creation failed, the number of rows modified will also be
  3207. zero, so no check for that is made.
  3208. */
  3209. changed= (info.copied || info.deleted || info.updated);
  3210. transactional_table= table->file->has_transactions();
  3211. if (thd->transaction.stmt.modified_non_trans_table)
  3212. {
  3213. if (!can_rollback_data())
  3214. thd->transaction.all.modified_non_trans_table= TRUE;
  3215. if (mysql_bin_log.is_open())
  3216. {
  3217. int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
  3218. /* error of writing binary log is ignored */
  3219. (void) thd->binlog_query(THD::ROW_QUERY_TYPE, thd->query(),
  3220. thd->query_length(),
  3221. transactional_table, FALSE, FALSE, errcode);
  3222. }
  3223. if (changed)
  3224. query_cache_invalidate3(thd, table, 1);
  3225. }
  3226. DBUG_ASSERT(transactional_table || !changed ||
  3227. thd->transaction.stmt.modified_non_trans_table);
  3228. table->file->ha_release_auto_increment();
  3229. }
  3230. DBUG_VOID_RETURN;
  3231. }
  3232. /***************************************************************************
  3233. CREATE TABLE (SELECT) ...
  3234. ***************************************************************************/
  3235. /**
  3236. Create table from lists of fields and items (or just return TABLE
  3237. object for pre-opened existing table).
  3238. @param thd [in] Thread object
  3239. @param create_info [in] Create information (like MAX_ROWS, ENGINE or
  3240. temporary table flag)
  3241. @param create_table [in] Pointer to TABLE_LIST object providing database
  3242. and name for table to be created or to be open
  3243. @param alter_info [in/out] Initial list of columns and indexes for the
  3244. table to be created
  3245. @param items [in] List of items which should be used to produce
  3246. rest of fields for the table (corresponding
  3247. fields will be added to the end of
  3248. alter_info->create_list)
  3249. @param lock [out] Pointer to the MYSQL_LOCK object for table
  3250. created will be returned in this parameter.
  3251. Since this table is not included in THD::lock
  3252. caller is responsible for explicitly unlocking
  3253. this table.
  3254. @param hooks [in] Hooks to be invoked before and after obtaining
  3255. table lock on the table being created.
  3256. @note
  3257. This function assumes that either table exists and was pre-opened and
  3258. locked at open_and_lock_tables() stage (and in this case we just emit
  3259. error or warning and return pre-opened TABLE object) or an exclusive
  3260. metadata lock was acquired on table so we can safely create, open and
  3261. lock table in it (we don't acquire metadata lock if this create is
  3262. for temporary table).
  3263. @note
  3264. Since this function contains some logic specific to CREATE TABLE ...
  3265. SELECT it should be changed before it can be used in other contexts.
  3266. @retval non-zero Pointer to TABLE object for table created or opened
  3267. @retval 0 Error
  3268. */
  3269. static TABLE *create_table_from_items(THD *thd, HA_CREATE_INFO *create_info,
  3270. TABLE_LIST *create_table,
  3271. Alter_info *alter_info,
  3272. List<Item> *items,
  3273. MYSQL_LOCK **lock,
  3274. TABLEOP_HOOKS *hooks)
  3275. {
  3276. TABLE tmp_table; // Used during 'Create_field()'
  3277. TABLE_SHARE share;
  3278. TABLE *table= 0;
  3279. uint select_field_count= items->elements;
  3280. /* Add selected items to field list */
  3281. List_iterator_fast<Item> it(*items);
  3282. Item *item;
  3283. Field *tmp_field;
  3284. DBUG_ENTER("create_table_from_items");
  3285. tmp_table.alias= 0;
  3286. tmp_table.timestamp_field= 0;
  3287. tmp_table.s= &share;
  3288. init_tmp_table_share(thd, &share, "", 0, "", "");
  3289. tmp_table.s->db_create_options=0;
  3290. tmp_table.s->blob_ptr_size= portable_sizeof_char_ptr;
  3291. tmp_table.s->db_low_byte_first=
  3292. test(create_info->db_type == myisam_hton ||
  3293. create_info->db_type == heap_hton);
  3294. tmp_table.null_row=tmp_table.maybe_null=0;
  3295. while ((item=it++))
  3296. {
  3297. Create_field *cr_field;
  3298. Field *field, *def_field;
  3299. if (item->type() == Item::FUNC_ITEM)
  3300. if (item->result_type() != STRING_RESULT)
  3301. field= item->tmp_table_field(&tmp_table);
  3302. else
  3303. field= item->tmp_table_field_from_field_type(&tmp_table, 0);
  3304. else
  3305. field= create_tmp_field(thd, &tmp_table, item, item->type(),
  3306. (Item ***) 0, &tmp_field, &def_field, 0, 0, 0, 0,
  3307. 0);
  3308. if (!field ||
  3309. !(cr_field=new Create_field(field,(item->type() == Item::FIELD_ITEM ?
  3310. ((Item_field *)item)->field :
  3311. (Field*) 0))))
  3312. DBUG_RETURN(0);
  3313. if (item->maybe_null)
  3314. cr_field->flags &= ~NOT_NULL_FLAG;
  3315. alter_info->create_list.push_back(cr_field);
  3316. }
  3317. DBUG_EXECUTE_IF("sleep_create_select_before_create", my_sleep(6000000););
  3318. /*
  3319. Create and lock table.
  3320. Note that we either creating (or opening existing) temporary table or
  3321. creating base table on which name we have exclusive lock. So code below
  3322. should not cause deadlocks or races.
  3323. We don't log the statement, it will be logged later.
  3324. If this is a HEAP table, the automatic DELETE FROM which is written to the
  3325. binlog when a HEAP table is opened for the first time since startup, must
  3326. not be written: 1) it would be wrong (imagine we're in CREATE SELECT: we
  3327. don't want to delete from it) 2) it would be written before the CREATE
  3328. TABLE, which is a wrong order. So we keep binary logging disabled when we
  3329. open_table().
  3330. */
  3331. {
  3332. if (!mysql_create_table_no_lock(thd, create_table->db,
  3333. create_table->table_name,
  3334. create_info, alter_info, 0,
  3335. select_field_count, NULL))
  3336. {
  3337. DBUG_EXECUTE_IF("sleep_create_select_before_open", my_sleep(6000000););
  3338. if (!(create_info->options & HA_LEX_CREATE_TMP_TABLE))
  3339. {
  3340. Open_table_context ot_ctx(thd, MYSQL_OPEN_REOPEN);
  3341. /*
  3342. Here we open the destination table, on which we already have
  3343. an exclusive metadata lock.
  3344. */
  3345. if (open_table(thd, create_table, thd->mem_root, &ot_ctx))
  3346. {
  3347. quick_rm_table(create_info->db_type, create_table->db,
  3348. table_case_name(create_info, create_table->table_name),
  3349. 0);
  3350. }
  3351. else
  3352. table= create_table->table;
  3353. }
  3354. else
  3355. {
  3356. Open_table_context ot_ctx(thd, MYSQL_OPEN_TEMPORARY_ONLY);
  3357. if (open_table(thd, create_table, thd->mem_root, &ot_ctx))
  3358. {
  3359. /*
  3360. This shouldn't happen as creation of temporary table should make
  3361. it preparable for open. But let us do close_temporary_table() here
  3362. just in case.
  3363. */
  3364. drop_temporary_table(thd, create_table, NULL);
  3365. }
  3366. else
  3367. table= create_table->table;
  3368. }
  3369. }
  3370. if (!table) // open failed
  3371. DBUG_RETURN(0);
  3372. }
  3373. DBUG_EXECUTE_IF("sleep_create_select_before_lock", my_sleep(6000000););
  3374. table->reginfo.lock_type=TL_WRITE;
  3375. hooks->prelock(&table, 1); // Call prelock hooks
  3376. /*
  3377. mysql_lock_tables() below should never fail with request to reopen table
  3378. since it won't wait for the table lock (we have exclusive metadata lock on
  3379. the table) and thus can't get aborted.
  3380. */
  3381. if (! ((*lock)= mysql_lock_tables(thd, &table, 1, 0)) ||
  3382. hooks->postlock(&table, 1))
  3383. {
  3384. if (*lock)
  3385. {
  3386. mysql_unlock_tables(thd, *lock);
  3387. *lock= 0;
  3388. }
  3389. drop_open_table(thd, table, create_table->db, create_table->table_name);
  3390. DBUG_RETURN(0);
  3391. }
  3392. DBUG_RETURN(table);
  3393. }
  3394. int
  3395. select_create::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
  3396. {
  3397. MYSQL_LOCK *extra_lock= NULL;
  3398. DBUG_ENTER("select_create::prepare");
  3399. TABLEOP_HOOKS *hook_ptr= NULL;
  3400. /*
  3401. For row-based replication, the CREATE-SELECT statement is written
  3402. in two pieces: the first one contain the CREATE TABLE statement
  3403. necessary to create the table and the second part contain the rows
  3404. that should go into the table.
  3405. For non-temporary tables, the start of the CREATE-SELECT
  3406. implicitly commits the previous transaction, and all events
  3407. forming the statement will be stored the transaction cache. At end
  3408. of the statement, the entire statement is committed as a
  3409. transaction, and all events are written to the binary log.
  3410. On the master, the table is locked for the duration of the
  3411. statement, but since the CREATE part is replicated as a simple
  3412. statement, there is no way to lock the table for accesses on the
  3413. slave. Hence, we have to hold on to the CREATE part of the
  3414. statement until the statement has finished.
  3415. */
  3416. class MY_HOOKS : public TABLEOP_HOOKS {
  3417. public:
  3418. MY_HOOKS(select_create *x, TABLE_LIST *create_table_arg,
  3419. TABLE_LIST *select_tables_arg)
  3420. : ptr(x),
  3421. create_table(create_table_arg),
  3422. select_tables(select_tables_arg)
  3423. {
  3424. }
  3425. private:
  3426. virtual int do_postlock(TABLE **tables, uint count)
  3427. {
  3428. int error;
  3429. THD *thd= const_cast<THD*>(ptr->get_thd());
  3430. TABLE_LIST *save_next_global= create_table->next_global;
  3431. create_table->next_global= select_tables;
  3432. error= thd->decide_logging_format(create_table);
  3433. create_table->next_global= save_next_global;
  3434. if (error)
  3435. return error;
  3436. TABLE const *const table = *tables;
  3437. if (thd->is_current_stmt_binlog_format_row() &&
  3438. !table->s->tmp_table)
  3439. {
  3440. if (int error= ptr->binlog_show_create_table(tables, count))
  3441. return error;
  3442. }
  3443. return 0;
  3444. }
  3445. select_create *ptr;
  3446. TABLE_LIST *create_table;
  3447. TABLE_LIST *select_tables;
  3448. };
  3449. MY_HOOKS hooks(this, create_table, select_tables);
  3450. hook_ptr= &hooks;
  3451. unit= u;
  3452. /*
  3453. Start a statement transaction before the create if we are using
  3454. row-based replication for the statement. If we are creating a
  3455. temporary table, we need to start a statement transaction.
  3456. */
  3457. if ((thd->lex->create_info.options & HA_LEX_CREATE_TMP_TABLE) == 0 &&
  3458. thd->is_current_stmt_binlog_format_row() &&
  3459. mysql_bin_log.is_open())
  3460. {
  3461. thd->binlog_start_trans_and_stmt();
  3462. }
  3463. DBUG_ASSERT(create_table->table == NULL);
  3464. DBUG_EXECUTE_IF("sleep_create_select_before_check_if_exists", my_sleep(6000000););
  3465. if (!(table= create_table_from_items(thd, create_info, create_table,
  3466. alter_info, &values,
  3467. &extra_lock, hook_ptr)))
  3468. /* abort() deletes table */
  3469. DBUG_RETURN(-1);
  3470. if (extra_lock)
  3471. {
  3472. DBUG_ASSERT(m_plock == NULL);
  3473. if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
  3474. m_plock= &m_lock;
  3475. else
  3476. m_plock= &thd->extra_lock;
  3477. *m_plock= extra_lock;
  3478. }
  3479. if (table->s->fields < values.elements)
  3480. {
  3481. my_error(ER_WRONG_VALUE_COUNT_ON_ROW, MYF(0), 1);
  3482. DBUG_RETURN(-1);
  3483. }
  3484. /* First field to copy */
  3485. field= table->field+table->s->fields - values.elements;
  3486. /* Mark all fields that are given values */
  3487. for (Field **f= field ; *f ; f++)
  3488. bitmap_set_bit(table->write_set, (*f)->field_index);
  3489. /* Don't set timestamp if used */
  3490. table->timestamp_field_type= TIMESTAMP_NO_AUTO_SET;
  3491. table->next_number_field=table->found_next_number_field;
  3492. restore_record(table,s->default_values); // Get empty record
  3493. thd->cuted_fields=0;
  3494. if (info.ignore || info.handle_duplicates != DUP_ERROR)
  3495. table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  3496. if (info.handle_duplicates == DUP_REPLACE &&
  3497. (!table->triggers || !table->triggers->has_delete_triggers()))
  3498. table->file->extra(HA_EXTRA_WRITE_CAN_REPLACE);
  3499. if (info.handle_duplicates == DUP_UPDATE)
  3500. table->file->extra(HA_EXTRA_INSERT_WITH_UPDATE);
  3501. if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
  3502. table->file->ha_start_bulk_insert((ha_rows) 0);
  3503. thd->abort_on_warning= (!info.ignore &&
  3504. (thd->variables.sql_mode &
  3505. (MODE_STRICT_TRANS_TABLES |
  3506. MODE_STRICT_ALL_TABLES)));
  3507. if (check_that_all_fields_are_given_values(thd, table, table_list))
  3508. DBUG_RETURN(1);
  3509. table->mark_columns_needed_for_insert();
  3510. table->file->extra(HA_EXTRA_WRITE_CACHE);
  3511. DBUG_RETURN(0);
  3512. }
  3513. int
  3514. select_create::binlog_show_create_table(TABLE **tables, uint count)
  3515. {
  3516. /*
  3517. Note 1: In RBR mode, we generate a CREATE TABLE statement for the
  3518. created table by calling store_create_info() (behaves as SHOW
  3519. CREATE TABLE). In the event of an error, nothing should be
  3520. written to the binary log, even if the table is non-transactional;
  3521. therefore we pretend that the generated CREATE TABLE statement is
  3522. for a transactional table. The event will then be put in the
  3523. transaction cache, and any subsequent events (e.g., table-map
  3524. events and binrow events) will also be put there. We can then use
  3525. ha_autocommit_or_rollback() to either throw away the entire
  3526. kaboodle of events, or write them to the binary log.
  3527. We write the CREATE TABLE statement here and not in prepare()
  3528. since there potentially are sub-selects or accesses to information
  3529. schema that will do a close_thread_tables(), destroying the
  3530. statement transaction cache.
  3531. */
  3532. DBUG_ASSERT(thd->is_current_stmt_binlog_format_row());
  3533. DBUG_ASSERT(tables && *tables && count > 0);
  3534. char buf[2048];
  3535. String query(buf, sizeof(buf), system_charset_info);
  3536. int result;
  3537. TABLE_LIST tmp_table_list;
  3538. memset(&tmp_table_list, 0, sizeof(tmp_table_list));
  3539. tmp_table_list.table = *tables;
  3540. query.length(0); // Have to zero it since constructor doesn't
  3541. result= store_create_info(thd, &tmp_table_list, &query, create_info,
  3542. /* show_database */ TRUE);
  3543. DBUG_ASSERT(result == 0); /* store_create_info() always return 0 */
  3544. if (mysql_bin_log.is_open())
  3545. {
  3546. int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
  3547. result= thd->binlog_query(THD::STMT_QUERY_TYPE,
  3548. query.ptr(), query.length(),
  3549. /* is_trans */ TRUE,
  3550. /* direct */ FALSE,
  3551. /* suppress_use */ FALSE,
  3552. errcode);
  3553. }
  3554. return result;
  3555. }
  3556. void select_create::store_values(List<Item> &values)
  3557. {
  3558. fill_record_n_invoke_before_triggers(thd, field, values, 1,
  3559. table->triggers, TRG_EVENT_INSERT);
  3560. }
  3561. void select_create::send_error(uint errcode,const char *err)
  3562. {
  3563. DBUG_ENTER("select_create::send_error");
  3564. DBUG_PRINT("info",
  3565. ("Current statement %s row-based",
  3566. thd->is_current_stmt_binlog_format_row() ? "is" : "is NOT"));
  3567. DBUG_PRINT("info",
  3568. ("Current table (at 0x%lu) %s a temporary (or non-existant) table",
  3569. (ulong) table,
  3570. table && !table->s->tmp_table ? "is NOT" : "is"));
  3571. /*
  3572. This will execute any rollbacks that are necessary before writing
  3573. the transcation cache.
  3574. We disable the binary log since nothing should be written to the
  3575. binary log. This disabling is important, since we potentially do
  3576. a "roll back" of non-transactional tables by removing the table,
  3577. and the actual rollback might generate events that should not be
  3578. written to the binary log.
  3579. */
  3580. tmp_disable_binlog(thd);
  3581. select_insert::send_error(errcode, err);
  3582. reenable_binlog(thd);
  3583. DBUG_VOID_RETURN;
  3584. }
  3585. bool select_create::send_eof()
  3586. {
  3587. bool tmp=select_insert::send_eof();
  3588. if (tmp)
  3589. abort();
  3590. else
  3591. {
  3592. /*
  3593. Do an implicit commit at end of statement for non-temporary
  3594. tables. This can fail, but we should unlock the table
  3595. nevertheless.
  3596. */
  3597. if (!table->s->tmp_table)
  3598. {
  3599. trans_commit_stmt(thd);
  3600. trans_commit_implicit(thd);
  3601. }
  3602. table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
  3603. table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
  3604. if (m_plock)
  3605. {
  3606. mysql_unlock_tables(thd, *m_plock);
  3607. *m_plock= NULL;
  3608. m_plock= NULL;
  3609. }
  3610. }
  3611. return tmp;
  3612. }
  3613. void select_create::abort_result_set()
  3614. {
  3615. DBUG_ENTER("select_create::abort_result_set");
  3616. /*
  3617. In select_insert::abort() we roll back the statement, including
  3618. truncating the transaction cache of the binary log. To do this, we
  3619. pretend that the statement is transactional, even though it might
  3620. be the case that it was not.
  3621. We roll back the statement prior to deleting the table and prior
  3622. to releasing the lock on the table, since there might be potential
  3623. for failure if the rollback is executed after the drop or after
  3624. unlocking the table.
  3625. We also roll back the statement regardless of whether the creation
  3626. of the table succeeded or not, since we need to reset the binary
  3627. log state.
  3628. */
  3629. tmp_disable_binlog(thd);
  3630. select_insert::abort_result_set();
  3631. thd->transaction.stmt.modified_non_trans_table= FALSE;
  3632. reenable_binlog(thd);
  3633. /* possible error of writing binary log is ignored deliberately */
  3634. (void) thd->binlog_flush_pending_rows_event(TRUE, TRUE);
  3635. if (m_plock)
  3636. {
  3637. mysql_unlock_tables(thd, *m_plock);
  3638. *m_plock= NULL;
  3639. m_plock= NULL;
  3640. }
  3641. if (table)
  3642. {
  3643. table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
  3644. table->file->extra(HA_EXTRA_WRITE_CANNOT_REPLACE);
  3645. table->auto_increment_field_not_null= FALSE;
  3646. drop_open_table(thd, table, create_table->db, create_table->table_name);
  3647. table=0; // Safety
  3648. }
  3649. DBUG_VOID_RETURN;
  3650. }
  3651. /*****************************************************************************
  3652. Instansiate templates
  3653. *****************************************************************************/
  3654. #ifdef HAVE_EXPLICIT_TEMPLATE_INSTANTIATION
  3655. template class List_iterator_fast<List_item>;
  3656. #ifndef EMBEDDED_LIBRARY
  3657. template class I_List<Delayed_insert>;
  3658. template class I_List_iterator<Delayed_insert>;
  3659. template class I_List<delayed_row>;
  3660. #endif /* EMBEDDED_LIBRARY */
  3661. #endif /* HAVE_EXPLICIT_TEMPLATE_INSTANTIATION */