
7075 lines
236 KiB

14 years ago
14 years ago
13 years ago
13 years ago
BUG#834534: Assertion `0' failed in replace_where_subcondition with semijoin subquery in HAVING
- The problem was that the code that checked whether the subquery is an AND-part of the WHERE clause didn't work correctly for nested subqueries. In particular, a grand-child subquery in HAVING was treated as if it was in the WHERE, which eventually caused an assert when replace_where_subcondition looked for the subquery predicate in the WHERE and couldn't find it there.
- The fix: Removed the implementation of the "thd_marker approach". thd->thd_marker was used to determine the location of the subquery predicate: setup_conds() would set it accordingly when making the {where|on_expr}->fix_fields(...) call so that AND-parts of the WHERE/ON clauses could determine that they are AND-parts. Item_cond_or::fix_fields(), Item_func::fix_fields(), Item_subselect::fix_fields() (this one was missed), and all other items-that-contain-items had to reset thd->thd_marker before calling fix_fields() for their children items, so that the children could see they are not AND-parts of WHERE/ON.
- The "thd_marker approach" required a lot of code in different locations to maintain the correct value of thd->thd_marker, so it was replaced.
- The new approach with mark_as_condition_AND_part does not keep context in thd->thd_marker. Instead, setup_conds() now calls {where|on_expr}->mark_as_condition_AND_part() and implementations of that function make sure that:
  - parts of AND-expressions get the mark_as_condition_AND_part() call
  - Item_in_subselect objects record that they are AND-parts of WHERE/ON
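The marking scheme described above can be sketched roughly as follows. This is a minimal illustration, not the server code: the class names `ItemSketch`, `InSubselectSketch`, `CondAndSketch`, `CondOrSketch` and the function `setup_conds_sketch()` are hypothetical stand-ins, and only the method name mark_as_condition_AND_part() comes from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the server's Item hierarchy.
struct ItemSketch {
  virtual ~ItemSketch() = default;
  // Default: the item ignores the call and, importantly, does not
  // forward it to its children, so nothing below it gets marked.
  virtual void mark_as_condition_AND_part() {}
};

// Stand-in for an IN-subquery predicate: it records the mark.
struct InSubselectSketch : ItemSketch {
  bool is_and_part = false;
  void mark_as_condition_AND_part() override { is_and_part = true; }
};

// Stand-in for an AND condition: every argument is itself an AND-part.
struct CondAndSketch : ItemSketch {
  std::vector<ItemSketch*> args;
  void mark_as_condition_AND_part() override {
    for (ItemSketch* arg : args) arg->mark_as_condition_AND_part();
  }
};

// Stand-in for an OR condition: inherits the no-op, so its arguments
// are never told that they are AND-parts.
struct CondOrSketch : ItemSketch {
  std::vector<ItemSketch*> args;
};

// Called once per WHERE / ON expression; HAVING is never passed here.
void setup_conds_sketch(ItemSketch* where_or_on_expr) {
  if (where_or_on_expr) where_or_on_expr->mark_as_condition_AND_part();
}
```

Because the mark is pushed down from the top of the WHERE/ON expression instead of being read from shared per-thread state, a predicate that is never reached by the call (for example one under OR, or one in HAVING) simply stays unmarked.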
14 years ago
Fix MySQL BUG#12329653
In MariaDB, when running in ONLY_FULL_GROUP_BY mode, the server produced an incorrect error message that there is an aggregate function without GROUP BY, for MIN/MAX functions created artificially during the subquery MIN/MAX optimization. The fix introduces a way to distinguish between MIN/MAX functions created artificially as a result of a rewrite and normal ones present in the query. The test for an ONLY_FULL_GROUP_BY violation now additionally checks whether a MIN/MAX function was part of a MIN/MAX subquery rewrite. To make it possible to distinguish these MIN/MAX functions, the patch introduces an additional flag in Item_in_subselect::in_strategy: SUBS_STRATEGY_CHOSEN. This flag is set when the optimizer makes its final choice of a subquery strategy. To keep the choice consistent, access to Item_in_subselect::in_strategy is provided via new class methods.
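A rough sketch of the idea, under stated assumptions: a bit set of candidate strategies with a separate "chosen" bit, reachable only through methods. Only the name SUBS_STRATEGY_CHOSEN comes from the commit message. The flag values, the other constant names, the class `InStrategySketch` and the helper `reports_missing_group_by_sketch()` are invented for illustration and do not reflect the real definitions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit values; the real ones are not given in the commit message.
const std::uint8_t SKETCH_IN_TO_EXISTS = 1;
const std::uint8_t SKETCH_MAXMIN_REWRITE = 2;
const std::uint8_t SUBS_STRATEGY_CHOSEN = 0x80;

class InStrategySketch {
  std::uint8_t in_strategy = 0;  // private: changed only via the methods below

public:
  // Candidates may be added only while no final choice has been made.
  bool add_candidate(std::uint8_t strategy) {
    if (is_chosen()) return false;
    in_strategy |= strategy;
    return true;
  }
  // The final choice replaces all candidates and sets the "chosen" bit.
  void choose(std::uint8_t strategy) {
    in_strategy = static_cast<std::uint8_t>(strategy | SUBS_STRATEGY_CHOSEN);
  }
  bool is_chosen() const { return (in_strategy & SUBS_STRATEGY_CHOSEN) != 0; }
  bool chosen_is(std::uint8_t strategy) const {
    return is_chosen() && (in_strategy & strategy) != 0;
  }
};

// Simplified ONLY_FULL_GROUP_BY check: an aggregate without GROUP BY is
// reported unless it was created by the MIN/MAX subquery rewrite.
bool reports_missing_group_by_sketch(bool is_aggregate, bool has_group_by,
                                     const InStrategySketch& strategy) {
  if (!is_aggregate || has_group_by) return false;
  return !strategy.chosen_is(SKETCH_MAXMIN_REWRITE);
}
```

Routing every change through `choose()` is what keeps the "final" state consistent: once the bit is set, no caller can quietly add another candidate.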
14 years ago
Merge with 5.1-microseconds
A lot of small fixes and new test cases.
client/mysqlbinlog.cc: Cast removed
client/mysqltest.cc: Added missing DBUG_RETURN
include/my_pthread.h: set_timespec_time_nsec() now only takes one argument
mysql-test/t/date_formats.test: Remove --disable_ps_protocol as ps now also supports microseconds
mysys/my_uuid.c: Changed to use my_interval_timer() instead of my_getsystime()
mysys/waiting_threads.c: Changed to use my_hrtime()
sql/field.h: Added bool special_const_compare() for fields that may convert values before compare (like year)
sql/field_conv.cc: Added test to get optimal copying of identical temporal values.
sql/item.cc: Return that item_int is equal if it's positive, even if unsigned flag is different. Fixed Item_cache_str::save_in_field() to have identical null check as other similar functions. Added proper NULL check to Item_cache_int::save_in_field()
sql/item_cmpfunc.cc: Don't call convert_constant_item() if there is nothing that is worth converting. Simplified test when years should be converted
sql/item_sum.cc: Mark cache values in Item_sum_hybrid as not constants to ensure they are not replaced by other cache values in compare_datetime()
sql/item_timefunc.cc: Changed sec_to_time() to take a my_decimal argument to ensure we don't lose any sub seconds. Added Item_temporal_func::get_time() (This simplifies some things)
sql/mysql_priv.h: Added Lazy_string_decimal()
sql/mysqld.cc: Added my_decimal constants max_seconds_for_time_type, time_second_part_factor
sql/table.cc: Changed expr_arena to be of type CONVENTIONAL_EXECUTION to ensure that we don't lose any items that are created by fix_fields()
sql/tztime.cc: TIME_to_gmt_sec() now sets *in_dst_time_gap in case of errors. This is needed to be able to detect if timestamp is 0
storage/maria/lockman.c: Changed from my_getsystime() to set_timespec_time_nsec()
storage/maria/ma_loghandler.c: Changed from my_getsystime() to my_hrtime()
storage/maria/ma_recovery.c: Changed from my_getsystime() to microsecond_interval_timer()
storage/maria/unittest/trnman-t.c: Changed from my_getsystime() to microsecond_interval_timer()
storage/xtradb/handler/ha_innodb.cc: Added support for new time, datetime and timestamp
unittest/mysys/thr_template.c: my_getsystime() -> my_interval_timer()
unittest/mysys/waiting_threads-t.c: my_getsystime() -> my_interval_timer()
15 years ago
6 years ago
Back-ported the patch of the mysql-5.6 code line that fixed several defects in the greedy optimization:
1) The greedy optimizer calculated the 'compare-cost' (CPU-cost) for iterating over the partial plan result at each level in the query plan as 'record_count / (double) TIME_FOR_COMPARE'. This cost was only used locally for the 'best' calculation at each level, and *not* accumulated into the total cost for the query plan. This fix added the 'CPU-cost' of processing 'current_record_count' records at each level to 'current_read_time' *before* it is used as the 'accumulated cost' argument to recursive best_extension_by_limited_search() calls. This ensured that the cost of a huge join-fanout early in the QEP was correctly reflected in the cost of the final QEP. To get identical cost for a 'best' optimized query and a straight_join with the same join order, the same change was also applied to optimize_straight_join() and get_partial_join_cost().
2) Furthermore, to get equal cost for a 'best' optimized query and a straight_join, the new code subtracted the same '0.001' in optimize_straight_join() as had already been done in best_extension_by_limited_search().
3) When best_extension_by_limited_search() aggregated the 'best' plan, a plan was 'best' by the check: 'if ((search_depth == 1) || (current_read_time < join->best_read))'. The term '(search_depth == 1)' incorrectly caused a new best plan to be collected whenever the specified 'search_depth' was reached, even if this partial query plan was more expensive than what we had already found.
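Point 1 can be illustrated with a small cost model. This is a sketch under assumptions, not the optimizer code: `JoinStepSketch`, `accumulated_plan_cost_sketch()` and the constant value 5.0 are chosen for illustration, while the variable names inside the loop follow the ones quoted in the commit message.

```cpp
#include <cassert>
#include <vector>

// Assumed value, for illustration only.
const double TIME_FOR_COMPARE_SKETCH = 5.0;

// One table in a fixed join order (hypothetical structure).
struct JoinStepSketch {
  double records_read;  // rows produced per row of the partial plan
  double read_time;     // access cost of this step
};

// Accumulates the cost the way point 1 describes: the CPU cost of
// processing current_record_count rows is added to current_read_time
// at every level, so it is carried into the cost of all deeper levels.
double accumulated_plan_cost_sketch(const std::vector<JoinStepSketch>& order) {
  double record_count = 1.0;
  double read_time = 0.0;
  for (const JoinStepSketch& step : order) {
    const double current_record_count = record_count * step.records_read;
    const double current_read_time =
        read_time + step.read_time +
        current_record_count / TIME_FOR_COMPARE_SKETCH;
    record_count = current_record_count;
    read_time = current_read_time;
  }
  return read_time;
}
```

With steps {10 rows, cost 2} then {100 rows, cost 3}, the total is 207; putting the 100-row step first gives 225, so a large fanout early in the order now shows up in the final cost.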
14 years ago
Subquery cache (MWL#66) added.
libmysqld/Makefile.am: The new file added.
mysql-test/r/index_merge_myisam.result: subquery_cache optimization option added.
mysql-test/r/myisam_mrr.result: subquery_cache optimization option added.
mysql-test/r/subquery_cache.result: The subquery cache tests added.
mysql-test/r/subselect3.result: Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect3_jcl6.result: Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect_no_mat.result: subquery_cache optimization option added.
mysql-test/r/subselect_no_opts.result: subquery_cache optimization option added.
mysql-test/r/subselect_no_semijoin.result: subquery_cache optimization option added.
mysql-test/r/subselect_sj.result: subquery_cache optimization option added.
mysql-test/r/subselect_sj_jcl6.result: subquery_cache optimization option added.
mysql-test/t/subquery_cache.test: The subquery cache tests added.
mysql-test/t/subselect3.test: Subquery cache switched off to avoid changing read statistics.
sql/CMakeLists.txt: The new file added.
sql/Makefile.am: The new files added.
sql/item.cc: Expression cache item (Item_cache_wrapper) added. Item_ref and Item_field fixed for correct usage of result field and fast resolving in SP.
sql/item.h: Expression cache item (Item_cache_wrapper) added. Item_ref and Item_field fixed for correct usage of result field and fast resolving in SP.
sql/item_cmpfunc.cc: Subquery cache added.
sql/item_cmpfunc.h: Subquery cache added.
sql/item_subselect.cc: Subquery cache added.
sql/item_subselect.h: Subquery cache added.
sql/item_sum.cc: Registration of subquery parameters added.
sql/mysql_priv.h: subquery_cache optimization option added.
sql/mysqld.cc: subquery_cache optimization option added.
sql/opt_range.cc: Fix due to subquery cache.
sql/opt_subselect.cc: Parameters of the function changed.
sql/procedure.h: .h file guard added.
sql/sql_base.cc: Registration of subquery parameters added.
sql/sql_class.cc: Option to allow adding indexes to a temporary table.
sql/sql_class.h: Item iterators added. Option to allow adding indexes to a temporary table.
sql/sql_expression_cache.cc: Expression cache for caching subqueries added.
sql/sql_expression_cache.h: Expression cache for caching subqueries added.
sql/sql_lex.cc: Registration of subquery parameters added.
sql/sql_lex.h: Registration of subqueries and subquery parameters added.
sql/sql_select.cc: Subquery cache added.
sql/sql_select.h: Subquery cache added.
sql/sql_union.cc: A new parameter to the function added.
sql/sql_update.cc: A new parameter to the function added.
sql/table.cc: Procedures to manage temporary table indexes added.
sql/table.h: Procedures to manage temporary table indexes added.
storage/maria/ha_maria.cc: Fix of handler to allow destroying a table in case of error during the table creation.
storage/maria/ha_maria.h: .h file guard added.
storage/myisam/ha_myisam.cc: Fix of handler to allow destroying a table in case of error during the table creation.
16 years ago
Fixed LP bugs #717577, #724942.
Both bugs happened due to the following problem. When a view column is referenced in the query, an Item_direct_view_ref object is created that refers to the Item_field for the column. All references to the same view column refer to the same Item_field. Different references can belong to different AND/OR levels and, as a result, can be included in different Item_equal objects. These Item_equal objects may include different constant objects. If these constant objects are substituted for the Item_field created for a view column, we have a conflict situation when the second substitution annuls the first substitution. This leads to wrong result sets returned by the query. Bug #724942 demonstrates such erroneous behaviour.
The test case of bug #717577 produces wrong result sets because the best equal fields of the multiple equalities built for different OR levels of the WHERE condition differ. The substitution for the best equal field in the second OR branch overwrites the substitution made for the first branch.
To avoid such conflicts we have to substitute for the references to the view columns rather than for the underlying field items. To make such substitutions possible we have to include into multiple equalities references to view columns rather than field items created for such columns.
This patch modifies the Item_equal class to include references to view columns into multiple equality objects. It also performs a clean up of the class methods and adds more comments. The methods of the Item_direct_view_ref class that assist substitutions for references to view columns have also been added by this patch.
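The conflict can be reproduced in miniature. The sketch below is hypothetical: `FieldItemSketch` and `ViewRefSketch` stand in for the shared field item and for the per-occurrence view references, and the three functions are invented for illustration. It only shows why a substitution stored on the shared object is overwritten, while one stored on each reference is not.

```cpp
#include <cassert>

// Stand-in for the single field item shared by all references to a view column.
struct FieldItemSketch {
  bool has_const = false;
  int const_value = 0;
};

// Stand-in for one reference to the view column (one per occurrence in the query).
struct ViewRefSketch {
  FieldItemSketch* field;
  bool has_const = false;
  int const_value = 0;
  explicit ViewRefSketch(FieldItemSketch* f) : field(f) {}
};

// Old behaviour: the constant is stored on the shared field item.
void substitute_in_field_sketch(ViewRefSketch& ref, int constant) {
  ref.field->has_const = true;
  ref.field->const_value = constant;
}

// New behaviour: the constant is stored on the reference itself.
void substitute_in_ref_sketch(ViewRefSketch& ref, int constant) {
  ref.has_const = true;
  ref.const_value = constant;
}

// The constant this occurrence of the column ends up being compared with.
int effective_const_sketch(const ViewRefSketch& ref) {
  return ref.has_const ? ref.const_value : ref.field->const_value;
}
```

For a condition shaped like `(v.a = 1 AND ...) OR (v.a = 2 AND ...)`, substituting on the field leaves both branches seeing 2, whereas substituting on the references keeps 1 in the first branch and 2 in the second.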
15 years ago
Code cleanup to get fewer reallocs() during execution.
- Changed TABLE->alias to String to get fewer reallocs when aliases are used.
- Preallocate some buffers
Changed some String->c_ptr() -> String->ptr() when \0 is not needed.
Fixed wrong usage of String->ptr() when we need a \0 terminated string.
Use my_strtod() instead of my_atof() to avoid having to add \0 to string.
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
sql/event_db_repository.cc: Update usage of TABLE->alias
sql/event_scheduler.cc: c_ptr() -> c_ptr_safe()
sql/events.cc: c_ptr() -> ptr() as \0 was not needed
sql/field.cc: Update usage of TABLE->alias
sql/field.h: Update usage of TABLE->alias
sql/ha_partition.cc: Update usage of TABLE->alias
sql/handler.cc: Update usage of TABLE->alias. Fixed wrong usage of str.ptr()
sql/item.cc: Fixed error where code wrongly assumed string was \0 terminated.
sql/item_func.cc: c_ptr() -> c_ptr_safe(). Update usage of TABLE->alias
sql/item_sum.h: Use my_strtod() instead of my_atof() to avoid having to add \0 to string
sql/lock.cc: Update usage of TABLE->alias
sql/log.cc: c_ptr() -> ptr() as \0 was not needed
sql/log_event.cc: c_ptr_quick() -> ptr() as \0 was not needed
sql/opt_range.cc: ptr() -> c_ptr() as \0 is needed
sql/opt_subselect.cc: Update usage of TABLE->alias
sql/opt_table_elimination.cc: Update usage of TABLE->alias
sql/set_var.cc: ptr() -> c_ptr() as \0 is needed. c_ptr() -> c_ptr_safe()
sql/sp.cc: c_ptr() -> ptr() as \0 was not needed
sql/sp_rcontext.cc: Update usage of TABLE->alias
sql/sql_base.cc: Preallocate buffers. Update usage of TABLE->alias
sql/sql_class.cc: Fix arguments to sprintf() to work even if string is not \0 terminated
sql/sql_insert.cc: Update usage of TABLE->alias. c_ptr() -> ptr() as \0 was not needed
sql/sql_load.cc: Preallocate buffers. Trivial optimizations
sql/sql_parse.cc: Trivial optimization
sql/sql_plugin.cc: c_ptr() -> ptr() as \0 was not needed
sql/sql_select.cc: Update usage of TABLE->alias
sql/sql_show.cc: Update usage of TABLE->alias
sql/sql_string.h: Added move() function to move allocated memory from one object to another.
sql/sql_table.cc: Update usage of TABLE->alias. c_ptr() -> c_ptr_safe()
sql/sql_test.cc: ptr() -> c_ptr_safe()
sql/sql_trigger.cc: Update usage of TABLE->alias. c_ptr() -> c_ptr_safe()
sql/sql_update.cc: Update usage of TABLE->alias
sql/sql_view.cc: ptr() -> c_ptr_safe()
sql/sql_yacc.yy: ptr() -> c_ptr()
sql/table.cc: Update usage of TABLE->alias
sql/table.h: Changed TABLE->alias to String to get fewer reallocs when aliases are used.
storage/federatedx/ha_federatedx.cc: Use c_ptr_safe() to ensure strings are \0 terminated.
storage/maria/ha_maria.cc: Update usage of TABLE->alias
storage/myisam/ha_myisam.cc: Update usage of TABLE->alias
storage/xtradb/row/row0sel.c: Ensure that null bits in record are properly reset. (Old code didn't work as row_search_for_mysql() can be called twice while reading fields from one row.)
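Why the choice between ptr() and a terminating accessor affects the number of reallocations can be sketched with a toy buffer. `StringSketch` below is a hypothetical simplification, not the server's String class: it keeps a length separate from the allocated size and counts how often it has to grow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

class StringSketch {
  std::vector<char> buf_;  // buf_.size() plays the role of the allocated length
  std::size_t len_ = 0;    // number of bytes actually in use
  int reallocs_ = 0;

  void grow(std::size_t needed) {
    if (needed > buf_.size()) {
      buf_.resize(needed);
      ++reallocs_;
    }
  }

public:
  explicit StringSketch(std::size_t prealloc = 0) : buf_(prealloc) {}

  void append(const char* data, std::size_t n) {
    if (n == 0) return;
    grow(len_ + n);
    std::memcpy(buf_.data() + len_, data, n);
    len_ += n;
  }

  // Raw view: valid only together with length(); no terminator is promised.
  const char* ptr() const { return buf_.data(); }
  std::size_t length() const { return len_; }

  // Terminated view: may have to grow the buffer to make room for '\0'.
  const char* c_ptr_safe() {
    grow(len_ + 1);
    buf_[len_] = '\0';
    return buf_.data();
  }

  int reallocs() const { return reallocs_; }
};
```

A caller that passes `ptr()` with `length()` never triggers the extra growth, and a preallocated buffer absorbs both the append and the terminator without reallocating, which is the effect the cleanup above is after.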
15 years ago
Changing field::field_name and Item::name to LEX_CSTRING

Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by the parser are now const strings, so there is less chance of accidentally changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (access to null pointers, access of freed memory, wrong arguments to printf-like functions)
- Removed a lot of casts from (const char*) to (char*)

Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some functions now take const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes from char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify code
- Added item_empty_name and item_used_name to be able to distinguish between items that were given an empty name and items that were not given a name. This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*' is no longer the same as table_name.*
- Removed unused function Item::rename()
- Added comparison of item->name_length before some calls to my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was set when it shouldn't be
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already give the error.

TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name (as part of lower_case_table_names)
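The strlen() and name_length items above can be illustrated with a minimal sketch. It assumes a hypothetical length-carrying string type, LexCStr, and hypothetical helpers make_lex() and name_eq(); none of these are the server's actual definitions. It shows why carrying the length avoids strlen() and why comparing lengths first lets a case-insensitive comparison bail out early.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical stand-in for a constant string that carries its own length.
// The real LEX_CSTRING is defined in the server headers and may differ.
struct LexCStr {
  const char *str;
  std::size_t length;
};

// Hypothetical helper: build a LexCStr from a string literal. The length
// comes from the array bound, so no strlen() call is needed.
template <std::size_t N>
constexpr LexCStr make_lex(const char (&lit)[N]) {
  return LexCStr{lit, N - 1};
}

// Hypothetical helper: case-insensitive equality that rejects on a length
// mismatch before looking at any character.
inline bool name_eq(const LexCStr &a, const LexCStr &b) {
  assert(a.str != nullptr && b.str != nullptr);
  if (a.length != b.length) return false;  // cheap early exit
  for (std::size_t i = 0; i < a.length; ++i) {
    if (std::tolower(static_cast<unsigned char>(a.str[i])) !=
        std::tolower(static_cast<unsigned char>(b.str[i])))
      return false;
  }
  return true;
}
```

With this sketch, name_eq(make_lex("col"), make_lex("cols")) returns false after a single integer comparison, which is the effect the name_length check is meant to achieve.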
9 years ago
Code cleanup to get fewer reallocs() during execution.

- Changed TABLE->alias to String to get fewer reallocs when aliases are used.
- Preallocate some buffers

Changed some String->c_ptr() -> String->ptr() when \0 is not needed.
Fixed wrong usage of String->ptr() when we need a \0 terminated string.
Use my_strtod() instead of my_atof() to avoid having to add \0 to string.
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.

sql/event_db_repository.cc:
  Update usage of TABLE->alias
sql/event_scheduler.cc:
  c_ptr() -> c_ptr_safe()
sql/events.cc:
  c_ptr() -> ptr() as \0 was not needed
sql/field.cc:
  Update usage of TABLE->alias
sql/field.h:
  Update usage of TABLE->alias
sql/ha_partition.cc:
  Update usage of TABLE->alias
sql/handler.cc:
  Update usage of TABLE->alias
  Fixed wrong usage of str.ptr()
sql/item.cc:
  Fixed error where code wrongly assumed string was \0 terminated.
sql/item_func.cc:
  c_ptr() -> c_ptr_safe()
  Update usage of TABLE->alias
sql/item_sum.h:
  Use my_strtod() instead of my_atof() to avoid having to add \0 to string
sql/lock.cc:
  Update usage of TABLE->alias
sql/log.cc:
  c_ptr() -> ptr() as \0 was not needed
sql/log_event.cc:
  c_ptr_quick() -> ptr() as \0 was not needed
sql/opt_range.cc:
  ptr() -> c_ptr() as \0 is needed
sql/opt_subselect.cc:
  Update usage of TABLE->alias
sql/opt_table_elimination.cc:
  Update usage of TABLE->alias
sql/set_var.cc:
  ptr() -> c_ptr() as \0 is needed
  c_ptr() -> c_ptr_safe()
sql/sp.cc:
  c_ptr() -> ptr() as \0 was not needed
sql/sp_rcontext.cc:
  Update usage of TABLE->alias
sql/sql_base.cc:
  Preallocate buffers
  Update usage of TABLE->alias
sql/sql_class.cc:
  Fix arguments to sprintf() to work even if string is not \0 terminated
sql/sql_insert.cc:
  Update usage of TABLE->alias
  c_ptr() -> ptr() as \0 was not needed
sql/sql_load.cc:
  Preallocate buffers
  Trivial optimizations
sql/sql_parse.cc:
  Trivial optimization
sql/sql_plugin.cc:
  c_ptr() -> ptr() as \0 was not needed
sql/sql_select.cc:
  Update usage of TABLE->alias
sql/sql_show.cc:
  Update usage of TABLE->alias
sql/sql_string.h:
  Added move() function to move allocated memory from one object to another.
sql/sql_table.cc:
  Update usage of TABLE->alias
  c_ptr() -> c_ptr_safe()
sql/sql_test.cc:
  ptr() -> c_ptr_safe()
sql/sql_trigger.cc:
  Update usage of TABLE->alias
  c_ptr() -> c_ptr_safe()
sql/sql_update.cc:
  Update usage of TABLE->alias
sql/sql_view.cc:
  ptr() -> c_ptr_safe()
sql/sql_yacc.yy:
  ptr() -> c_ptr()
sql/table.cc:
  Update usage of TABLE->alias
sql/table.h:
  Changed TABLE->alias to String to get fewer reallocs when aliases are used.
storage/federatedx/ha_federatedx.cc:
  Use c_ptr_safe() to ensure strings are \0 terminated.
storage/maria/ha_maria.cc:
  Update usage of TABLE->alias
storage/myisam/ha_myisam.cc:
  Update usage of TABLE->alias
storage/xtradb/row/row0sel.c:
  Ensure that null bits in record are properly reset.
  (Old code didn't work as row_search_for_mysql() can be called twice while reading fields from one row.)
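The distinction this commit relies on, between a raw pointer that is valid only together with a length and a pointer that is guaranteed to be \0 terminated, can be sketched as follows. ByteBuf and make_buf() are hypothetical names invented for this illustration; the sketch does not reproduce the server's String class and only mirrors the idea behind ptr() and c_ptr_safe().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical buffer that tracks its length separately from any terminator.
class ByteBuf {
  std::vector<char> data_;
  std::size_t len_ = 0;

public:
  // Append n bytes; no terminator is written.
  void append(const char *s, std::size_t n) {
    if (n == 0) return;
    if (data_.size() < len_ + n) data_.resize(len_ + n);
    std::memcpy(data_.data() + len_, s, n);
    len_ += n;
  }

  // Raw pointer: only valid together with length(); may lack a trailing \0.
  const char *ptr() const { return data_.data(); }
  std::size_t length() const { return len_; }

  // Terminated pointer: makes room for one extra byte if needed and always
  // writes the \0 itself, so no byte past the content is read.
  const char *c_ptr_safe() {
    if (data_.size() < len_ + 1) data_.resize(len_ + 1);
    data_[len_] = '\0';
    return data_.data();
  }
};

// Hypothetical convenience constructor used in the example below.
inline ByteBuf make_buf(const char *s, std::size_t n) {
  ByteBuf b;
  b.append(s, n);
  return b;
}
```

Callers that pass a pointer and a length (for example to memcmp) can use ptr() and skip the terminator; callers that hand the pointer to a C API such as strlen() need the terminated form.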
15 years ago
MDEV-12387 Push conditions into materialized subqueries

The logic and the implementation scheme are similar to those of MDEV-9197 Pushdown conditions into non-mergeable views/derived tables.

How the pushdown works, by example:

select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10);

The implementation scheme:
1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY
3. Extract from cond the condition cond_where that depends only on the fields from the left_part that occupy the same positions in the left_part (have the same indexes) as the F_group fields in the projection of the right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part

The optimization is performed in Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery.

New test file in_subq_cond_pushdown.test is created.

There are also some changes made for setup_jtbm_semi_joins(). It is now decomposed into two procedures: setup_degenerate_jtbm_semi_joins(), which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(), which is called after optimize_cond(). The new setup_jtbm_semi_joins() is written so that the result of its work is the same as if it were called before optimize_cond().

The code that is common for pushdown into materialized derived tables and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
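The five steps above can be sketched in a heavily simplified form. The types Conjunct, Projection and Pushed and the functions split_pushdown() and example_pushdown() are hypothetical names made up for this illustration; the sketch handles only conjuncts over a single left-part column and represents conditions as strings, so it is not how the server's Item tree code works.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One AND-part of the outer condition that refers to a single column of the
// left part of the IN predicate, e.g. {0, "> 3"} for "a > 3" in (a,b) IN (...).
struct Conjunct {
  std::size_t left_pos;   // position of the column in the left part
  std::string predicate;  // operator and constant, kept as text
};

// One item of the subquery's select list.
struct Projection {
  std::string expr;     // e.g. "x" or "max(y)"
  bool is_group_field;  // true if the item is a GROUP BY field
};

struct Pushed {
  std::vector<std::string> where;   // pushed before grouping
  std::vector<std::string> having;  // pushed after grouping
};

// Rewrite each conjunct in terms of the select-list item at the same position.
// Conjuncts over grouping fields go to WHERE, the rest go to HAVING.
inline Pushed split_pushdown(const std::vector<Conjunct> &cond,
                             const std::vector<Projection> &right_part) {
  Pushed out;
  for (const Conjunct &c : cond) {
    if (c.left_pos >= right_part.size()) continue;  // not in the left part
    const Projection &p = right_part[c.left_pos];
    assert(!p.expr.empty());
    std::string rewritten = p.expr + " " + c.predicate;
    (p.is_group_field ? out.where : out.having).push_back(rewritten);
  }
  return out;
}

// The example from the commit message: a>3 and b>10 with
// (a,b) in (select x,max(y) from t2 group by x).
inline Pushed example_pushdown() {
  std::vector<Conjunct> cond = {{0, "> 3"}, {1, "> 10"}};
  std::vector<Projection> right_part = {{"x", true}, {"max(y)", false}};
  return split_pushdown(cond, right_part);
}
```

For the example query this yields "x > 3" for the WHERE clause and "max(y) > 10" for the HAVING clause, matching the rewritten subquery shown in the commit message.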
8 years ago
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
MDEV-12387 Push conditions into materialized subqueries The logic and the implementation scheme are similar with the MDEV-9197 Pushdown conditions into non-mergeable views/derived tables How the push down is made on the example: select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x); --> select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10); The implementation scheme: 1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part) 2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY 3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part 4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond 5. Transform cond so it can be pushed into the HAVING clause of the right_part The optimization is made in the Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery. New test file in_subq_cond_pushdown.test is created. There are also some changes made for setup_jtbm_semi_joins(). Now it is decomposed into the 2 procedures: setup_degenerate_jtbm_semi_joins() that is called before optimize_cond() for cond and setup_jtbm_semi_joins() that is called after optimize_cond(). New setup_jtbm_semi_joins() is made in the way so that the result of its work is the same as if it was called before optimize_cond(). The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
12 years ago
Fix MySQL BUG#12329653

In MariaDB, when running in ONLY_FULL_GROUP_BY mode, the server produced an incorrect error message that there is an aggregate function without GROUP BY, for MIN/MAX functions created artificially during the subquery MIN/MAX optimization.

The fix introduces a way to distinguish between MIN/MAX functions created artificially as a result of a rewrite and normal ones present in the query. The test for an ONLY_FULL_GROUP_BY violation now additionally checks whether a MIN/MAX function was part of a MIN/MAX subquery rewrite.

To make these MIN/MAX functions distinguishable, the patch introduces an additional flag in Item_in_subselect::in_strategy: SUBS_STRATEGY_CHOSEN. This flag is set when the optimizer makes its final choice of a subquery strategy. To keep the choice consistent, access to Item_in_subselect::in_strategy is provided via new class methods.
14 years ago
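The idea of the BUG#12329653 fix, a "final choice" flag kept in the same bitmask as the candidate strategies and reached only through accessor methods, can be sketched as follows. This is an illustrative toy, not the server's definition: ToyInSubselect, the TOY_* constants and is_user_written_aggregate() are hypothetical names, and only the role of the "chosen" flag mirrors SUBS_STRATEGY_CHOSEN from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical strategy bits; the last one plays the role that the commit
// message assigns to SUBS_STRATEGY_CHOSEN.
enum : std::uint32_t {
  TOY_IN_TO_EXISTS    = 1u << 0,
  TOY_MATERIALIZATION = 1u << 1,
  TOY_MAXMIN_INJECTED = 1u << 2,
  TOY_STRATEGY_CHOSEN = 1u << 3
};

class ToyInSubselect {
  std::uint32_t in_strategy = 0;  // private: changed only via the methods

 public:
  // Records a candidate strategy while the optimizer is still deciding.
  void add_strategy(std::uint32_t s) { in_strategy |= s; }

  // Records the final decision and marks it as final in one step, so the
  // mask cannot hold a final strategy without the "chosen" bit.
  void set_strategy(std::uint32_t s) { in_strategy = s | TOY_STRATEGY_CHOSEN; }

  bool has_strategy(std::uint32_t s) const { return (in_strategy & s) != 0; }
  bool is_chosen() const { return has_strategy(TOY_STRATEGY_CHOSEN); }
};

// A MIN/MAX aggregate counts as user-written unless it belongs to a subquery
// whose final, chosen strategy is the MIN/MAX rewrite.
bool is_user_written_aggregate(const ToyInSubselect* owner) {
  if (owner == nullptr)
    return true;  // aggregate is not attached to any IN subquery rewrite
  return !(owner->is_chosen() && owner->has_strategy(TOY_MAXMIN_INJECTED));
}
```

A check in the spirit of the ONLY_FULL_GROUP_BY test would then skip aggregates for which is_user_written_aggregate() returns false.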
Fix bug lp:985667, MDEV-229

Analysis:
The reason for the wrong result is the interaction between constant optimization (in this case a 1-row table) and subquery optimization.

- First the outer query is optimized, and 'make_join_statistics' finds that table t2 has one row, reads that row, and marks the whole table as constant. This also means that all fields of t2 are constant.
- Next, the subquery is optimized at the end of the outer 'make_join_statistics'. The field 'f2' is considered constant, with value '3'. The subquery predicate is rewritten as the constant TRUE.
- The outer query execution detects early that the whole query result is empty and calls 'return_zero_rows'. Since the query uses implicit grouping, one row has to be produced with special values for the aggregates (depending on each aggregate function) and NULL values for all non-aggregate fields. This function calls 'no_rows_in_result' to set each aggregate function to its default value for an empty input, and then calls 'send_data', which in turn evaluates each Item in the SELECT list.
- When evaluation reaches the subquery predicate, it executes the subquery with field 'f2' having the constant value '3', and the subquery produces the incorrect result '7'.

Solution:
Implement Item::no_rows_in_result for all subquery predicates. For this to work, all val_* methods of all subquery predicates must also respect the Item_subselect::forced_const flag. Otherwise the subqueries are executed anyway, and whatever result the subquery evaluation produces overrides the default value set by no_rows_in_result.
14 years ago
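The interplay between no_rows_in_result and the forced_const flag described in the MDEV-229 fix can be modelled in a few lines. This is a sketch, not the server's classes: ToySubqueryPredicate and its members are hypothetical, and the default chosen here (NULL, reported as 0 by val_int()) is an assumption made for the illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a subquery predicate item.
class ToySubqueryPredicate {
  std::function<long long()> run_subquery;  // executes the subquery
  bool forced_const = false;  // value is fixed, do not execute again
  bool null_value = false;
  long long value = 0;

 public:
  explicit ToySubqueryPredicate(std::function<long long()> f)
      : run_subquery(std::move(f)) {}

  // Called when the enclosing query is known to produce no rows:
  // pin the predicate to its default value.
  void no_rows_in_result() {
    value = 0;
    null_value = true;
    forced_const = true;
  }

  // Respects forced_const: once the value is pinned, the subquery is not
  // executed, so it cannot overwrite the default.
  long long val_int() {
    if (forced_const)
      return value;
    null_value = false;
    value = run_subquery();
    return value;
  }

  bool is_null() const { return null_value; }
};
```

Without the forced_const check at the top of val_int(), the call after no_rows_in_result() would execute the subquery again and return its result, which is the failure mode the analysis describes.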
MDEV-12387 Push conditions into materialized subqueries

The logic and the implementation scheme are similar to those of
MDEV-9197 Pushdown conditions into non-mergeable views/derived tables

How the pushdown is made, shown on an example:

  select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 group by x);
  -->
  select * from t1 where a>3 and b>10 and (a,b) in (select x,max(y) from t2 where x>3 group by x having max(y)>10);

The implementation scheme:
1. Search for the condition cond that depends only on the fields from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the IN subquery (right_part) that are used in the GROUP BY
3. Extract from the cond condition cond_where that depends only on the fields from the left_part that stay at the same places in the left_part (have the same indexes) as the F_group fields in the projection of the right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the right_part and delete cond_where from the cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part

The optimization is made in Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the variable condition_pushdown_for_subquery.

New test file in_subq_cond_pushdown.test is created.

There are also some changes made for setup_jtbm_semi_joins(). It is now decomposed into 2 procedures: setup_degenerate_jtbm_semi_joins(), which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(), which is called after optimize_cond(). The new setup_jtbm_semi_joins() is made in such a way that the result of its work is the same as if it was called before optimize_cond().

The code that is common for pushdown into materialized derived and into materialized IN subqueries is factored out into pushdown_cond_for_derived(), Item_in_subselect::pushdown_cond_for_in_subquery() and st_select_lex::pushdown_cond_into_where_clause().
8 years ago
/*
   Copyright (c) 2010, 2019, MariaDB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA */

/**
  @file

  @brief
    Semi-join subquery optimizations code
*/

#ifdef USE_PRAGMA_IMPLEMENTATION
#pragma implementation // gcc: Class implementation
#endif

#include "mariadb.h"
#include "sql_base.h"
#include "sql_const.h"
#include "sql_select.h"
#include "filesort.h"
#include "opt_subselect.h"
#include "sql_test.h"
#include <my_bit.h>
#include "opt_trace.h"

/*
  This file contains optimizations for semi-join subqueries.

  Contents
  --------
  1. What is a semi-join subquery

  2. General idea about semi-join execution
  2.1 Correlated vs uncorrelated semi-joins
  2.2 Mergeable vs non-mergeable semi-joins

  3. Code-level view of semi-join processing
  3.1 Conversion and pre-optimization data structures
    3.1.1 Merged semi-join TABLE_LIST object
    3.1.2 Non-merged semi-join data structure
  3.2 Semi-joins and join optimization
    3.2.1 Non-merged semi-joins and join optimization
    3.2.2 Merged semi-joins and join optimization
  3.3 Semi-joins and query execution

  1. What is a semi-join subquery
  -------------------------------
  We use this definition of semi-join:

    outer_tbl SEMI JOIN inner_tbl ON cond = {set of outer_tbl.row such that
                                            there exists an inner_tbl.row for
                                            which
                                            cond(outer_tbl.row,inner_tbl.row)
                                            is satisfied}

  That is, the semi-join operation is similar to the inner join operation,
  with the exception that we don't care how many matches a row from outer_tbl
  has in inner_tbl.

  In SQL terms: a semi-join subquery is an IN subquery that is an AND-part of
  the WHERE/ON clause.
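
  The definition above can be sketched as a small stand-alone C++ program
  fragment. This is only an illustration under simplifying assumptions: rows
  are plain integers, and Row, inner_join(), semi_join() and
  semi_join_example() are hypothetical names made up for the sketch, not
  server code. Only // comments are used so that the sketch can sit inside
  this block comment.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Assumption for the sketch: a row is a single integer column.
using Row = int;

// Inner join: one output pair for every (outer, inner) match.
std::vector<std::pair<Row, Row>> inner_join(
    const std::vector<Row> &outer_tbl, const std::vector<Row> &inner_tbl,
    const std::function<bool(Row, Row)> &cond)
{
  std::vector<std::pair<Row, Row>> out;
  for (Row o : outer_tbl)
    for (Row i : inner_tbl)
      if (cond(o, i))
        out.emplace_back(o, i);
  return out;
}

// Semi-join: an outer row is produced once if at least one match exists;
// the number of matches is irrelevant.
std::vector<Row> semi_join(
    const std::vector<Row> &outer_tbl, const std::vector<Row> &inner_tbl,
    const std::function<bool(Row, Row)> &cond)
{
  std::vector<Row> out;
  for (Row o : outer_tbl)
    for (Row i : inner_tbl)
      if (cond(o, i))
      {
        out.push_back(o);
        break;  // further matches for this outer row do not matter
      }
  return out;
}

// Usage example with an equality condition.
inline void semi_join_example()
{
  const std::vector<Row> outer_tbl{1, 2, 3};
  const std::vector<Row> inner_tbl{2, 2, 3, 4};
  auto eq = [](Row o, Row i) { return o == i; };
  // Outer row 2 matches twice, outer row 3 once: three joined pairs.
  assert(inner_join(outer_tbl, inner_tbl, eq).size() == 3);
  // The semi-join returns each matching outer row exactly once.
  assert((semi_join(outer_tbl, inner_tbl, eq) == std::vector<Row>{2, 3}));
}
```

  In the example, the inner join yields three record combinations while the
  semi-join yields only the two outer rows that have a match.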

  2. General idea about semi-join execution
  -----------------------------------------
  We can execute a semi-join in a way similar to an inner join, with the
  exception that we must ensure that we do not generate record combinations
  that differ only in rows of inner tables.
  There are a number of different ways to achieve this property, implemented
  by a number of semi-join execution strategies.
  Some strategies can handle any semi-join; others can be applied only to
  semi-joins that have certain properties, which are described below:

  2.1 Correlated vs uncorrelated semi-joins
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Uncorrelated semi-joins are special in that they allow us to

  - execute the subquery (possible as it's uncorrelated)
  - somehow make sure that the generated set does not have duplicates
  - perform an inner join with the outer tables.

  or, rephrasing in SQL form:

  SELECT ... FROM ot WHERE ot.col IN (SELECT it.col FROM it WHERE uncorr_cond)
    ->
  SELECT ... FROM ot JOIN (SELECT DISTINCT it.col FROM it WHERE uncorr_cond)
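
  The three steps above can be sketched as follows. This is a minimal
  illustration, assuming single-column integer tables; materialize_distinct(),
  join_with_materialized() and uncorrelated_example() are hypothetical names
  for this sketch and do not exist in the server. Only // comments are used
  so that the sketch can sit inside this block comment.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Steps 1 and 2: "execute the subquery" once and drop duplicates.
// it_col stands for the values of it.col that satisfy uncorr_cond.
std::set<int> materialize_distinct(const std::vector<int> &it_col)
{
  return std::set<int>(it_col.begin(), it_col.end());
}

// Step 3: inner join of ot.col with the duplicate-free set. Since the set
// holds each value once, an outer row can match at most one element, so the
// inner join cannot multiply outer rows.
std::vector<int> join_with_materialized(const std::vector<int> &ot_col,
                                        const std::set<int> &distinct_it)
{
  std::vector<int> out;
  for (int v : ot_col)
    if (distinct_it.count(v) != 0)
      out.push_back(v);
  return out;
}

// Usage example.
inline void uncorrelated_example()
{
  const std::vector<int> ot_col{1, 2, 2, 5};
  const std::vector<int> it_col{2, 2, 5, 5, 7};
  const std::set<int> distinct_it = materialize_distinct(it_col);
  assert(distinct_it.size() == 3);  // {2, 5, 7}
  // Both outer rows with value 2 are kept, each exactly once.
  assert((join_with_materialized(ot_col, distinct_it) ==
          std::vector<int>{2, 2, 5}));
}
```

  Note that duplicates among the outer rows are preserved; only duplicates
  produced by the inner side are removed.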

  2.2 Mergeable vs non-mergeable semi-joins
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  The semi-join operation has some degree of commutativity with the inner
  join operation: we can join the subquery's tables with outside table(s) and
  eliminate duplicate record combinations after that:

    ot1 JOIN ot2 SEMI_JOIN{it1,it2} (it1 JOIN it2) ON sjcond(ot2,it*) ->
              |
              +-------------------------------+
                                              v
    ot1 SEMI_JOIN{it1,it2} (it1 JOIN it2 JOIN ot2) ON sjcond(ot2,it*)

  In order for this to work, the subquery's top-level operation must be a
  join, and not grouping or ordering with limit (grouping and ordering with
  limit are not commutative with duplicate removal). In other words, the
  conversion is possible when the subquery doesn't have a GROUP BY clause, any
  aggregate functions*, or an ORDER BY ... LIMIT clause.

  Definitions:
  - Subquery whose top-level operation is a join is called *mergeable semi-join*
  - All other kinds of semi-join subqueries are considered non-mergeable.

  *- this requirement is actually too strong, but its exceptions are too
  complicated to be considered here.
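
  The "join first, eliminate duplicates afterwards" idea can be sketched for
  a single outer table as shown below. This is an illustration only, assuming
  integer columns and an equality semi-join condition; join_then_dedup() and
  merged_example() are hypothetical names and this is not how the server
  implements any particular strategy. Duplicates are detected by the identity
  (index) of the outer row, not by its value. Only // comments are used so
  that the sketch can sit inside this block comment.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Returns the indexes of the outer rows that have at least one match.
// The result is computed by producing the full inner join first and
// removing duplicate outer rows afterwards.
std::vector<std::size_t> join_then_dedup(const std::vector<int> &ot,
                                         const std::vector<int> &it)
{
  // Phase 1: plain inner join; remember the outer row of every combination.
  std::vector<std::size_t> joined;
  for (std::size_t o = 0; o < ot.size(); o++)
    for (int i : it)
      if (ot[o] == i)
        joined.push_back(o);

  // Phase 2: keep only the first combination seen for each outer row.
  std::set<std::size_t> seen;
  std::vector<std::size_t> out;
  for (std::size_t o : joined)
    if (seen.insert(o).second)
      out.push_back(o);
  return out;
}

// Usage example.
inline void merged_example()
{
  // Outer rows 0 and 2 both hold value 2 and each matches two inner rows;
  // after duplicate removal every outer row appears exactly once.
  assert((join_then_dedup({2, 3, 2}, {2, 2, 3}) ==
          std::vector<std::size_t>{0, 1, 2}));
}
```

  Grouping or ORDER BY ... LIMIT inside the subquery would change which inner
  rows exist before the join, which is why the sketch only applies when the
  subquery's top-level operation is a join.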

  3. Code-level view of semi-join processing
  ------------------------------------------

  3.1 Conversion and pre-optimization data structures
  ---------------------------------------------------
  * When doing JOIN::prepare for the subquery, we detect that it can be
    converted into a semi-join and register it in parent_join->sj_subselects

  * At the start of parent_join->optimize(), the predicate is converted into
    a semi-join node. A semi-join node is a TABLE_LIST object that is linked
    somewhere in parent_join->join_list (either it is just present there, or
    it is a descendant of some of its members).

  There are two kinds of semi-joins:
  - Merged semi-joins
  - Non-merged semi-joins

  3.1.1 Merged semi-join TABLE_LIST object
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Merged semi-join object is a TABLE_LIST that contains a sub-join of
  subquery tables and the semi-join ON expression (in this respect it is
  very similar to nested outer join representation)
  Merged semi-join represents this SQL:

    ... SEMI JOIN (inner_tbl1 JOIN ... JOIN inner_tbl_n) ON sj_on_expr

  Semi-join objects of this kind have TABLE_LIST::sj_subq_pred set.

  3.1.2 Non-merged semi-join data structure
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Non-merged semi-join object is a leaf TABLE_LIST object that has a subquery
  that produces rows. It is similar to a base table and represents this SQL:

    ... SEMI_JOIN (SELECT non_mergeable_select) ON sj_on_expr

  Subquery items that were converted into semi-joins are removed from the WHERE
  clause. (They do remain in PS-saved WHERE clause, and they replace themselves
  with Item_int(1) on subsequent re-executions).
  126. 3.2 Semi-joins and join optimization
  127. ------------------------------------
  128. 3.2.1 Non-merged semi-joins and join optimization
  129. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  130. For join optimization purposes, non-merged semi-join nests are similar to
  131. base tables. Each such nest is represented by one JOIN_TAB, which has
  132. two possible access strategies:
  133. - full table scan (representing SJ-Materialization-Scan strategy)
  134. - eq_ref-like table lookup (representing SJ-Materialization-Lookup)
  135. Unlike regular base tables, non-merged semi-joins have:
  136. - non-zero JOIN_TAB::startup_cost, and
  137. - join_tab->table->is_filled_at_execution()==TRUE, which means one
  138. cannot do const table detection, range analysis or other dataset-dependent
  139. optimizations.
  140. Instead, get_delayed_table_estimates() will run optimization for the
  141. subquery and produce an E(materialized table size).
  142. 3.2.2 Merged semi-joins and join optimization
  143. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  144. - optimize_semijoin_nests() does pre-optimization
  145. - during join optimization, the join has one JOIN_TAB (or is it POSITION?)
  146. array, and suffix-based detection is used, see advance_sj_state()
  147. - after join optimization is done, get_best_combination() switches
  148. the data-structure to prefix-based, multiple JOIN_TAB ranges format.
  149. 3.3 Semi-joins and query execution
  150. ----------------------------------
  151. * Join executor has hooks for all semi-join strategies.
  152. TODO elaborate.
  153. */
  154. /*
  155. EqualityPropagationAndSjmNests
  156. ******************************
  157. Equalities are used for:
  158. P1. Equality propagation
  159. P2. Equality substitution [for a certain join order]
  160. The equality propagation is not affected by SJM nests. In fact, it is done
  161. before we determine the execution plan, i.e. before we even know we will use
  162. SJM-nests for execution.
  163. The equality substitution is affected.
  164. Substitution without SJMs
  165. =========================
  166. When one doesn't have SJM nests, tables have a strict join order:
  167. --------------------------------->
  168. t1 -- t2 -- t3 -- t4 --- t5
  169. ? ^
  170. \
  171. --(part-of-WHERE)
  172. Parts of WHERE/ON and ref expressions are attached at some point along the axis.
  173. Expression is allowed to refer to a table column if the table is to the left of
  174. the attachment point. For any given expression, we have a goal:
  175. "Move leftmost allowed attachment point as much as possible to the left"
  176. Substitution with SJMs - task setting
  177. =====================================
  178. When SJM nests are present, there is no global strict table ordering anymore:
  179. --------------------------------->
  180.   ot1 -- ot2 --- sjm -- ot4 --- ot5
  181.                   |
  182.                   |                Main execution
  183. - - - - - - - - - - - - - - - - - - - - - - - -
  184.                   |                Materialization
  185.      it1 -- it2 --/
  186. Besides that, we must take into account that
  187. - values for outer table columns, otN.col, are inaccessible at
  188. materialization step (SJM-RULE)
  189. - values for inner table columns, itN.col, are inaccessible at Main execution
  190. step, except for SJ-Materialization-Scan and columns that are in the
  191. subquery's select list. (SJM-RULE)
  192. Substitution with SJMs - solution
  193. =================================
  194. First, we introduce global strict table ordering like this:
  195. ot1 - ot2 --\                    /--- ot3 -- ot5
  196.              \--- it1 --- it2 --/
  197. Now, let's see how to meet (SJM-RULE).
  198. SJ-Materialization is only applicable for uncorrelated subqueries. From this, it
  199. follows that any multiple equality will either
  200. 1. include only columns of outer tables, or
  201. 2. include only columns of inner tables, or
  202. 3. include columns of inner and outer tables, joined together through one
  203. of IN-equalities.
  204. Cases #1 and #2 can be handled in the same way as with regular inner joins.
  205. Case #3 requires special handling, so that we don't construct violations of
  206. (SJM-RULE). Let's consider possible ways to build violations.
  207. Equality propagation starts with the clause in this form
  208. top_query_where AND subquery_where AND in_equalities
  209. First, it builds multi-equalities. It can also build a mixed multi-equality
  210. multiple-equal(ot1.col, ot2.col, ... it1.col, itN.col)
  211. Multi-equalities are pushed down the OR-clauses in top_query_where and in
  212. subquery_where, so it's possible that clauses like this one are built:
  213. subquery_cond OR (multiple-equal(it1.col, ot1.col,...) AND ...)
  214. ^^^^^^^^^^^^^                                 \
  215.       |                                        this must be evaluated
  216.       \- can only be evaluated                 at the main phase.
  217.          at the materialization phase
  218. Finally, equality substitution is started. It does two operations:
  219. 1. Field reference substitution
  220. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  221. (In the code, this is Item_field::replace_equal_field)
  222. This is a process of replacing each reference to "tblX.col"
  223. with the first element of the multi-equality. (REF-SUBST-ORIG)
  224. This behaviour can cause problems with Semi-join nests. Suppose, we have a
  225. condition:
  226. func(it1.col, it2.col)
  227. and a multi-equality(ot1.col, it1.col). Then, reference to "it1.col" will be
  228. replaced with "ot1.col", constructing a condition
  229. func(ot1.col, it2.col)
  230. which will be a violation of (SJM-RULE).
  231. In order to avoid this, (REF-SUBST-ORIG) is amended as follows:
  232. - references to tables "itX.col" that are inner wrt some SJM nest, are
  233. replaced with references to the first inner table from the same SJM nest.
  234. - references to top-level tables "otX.col" are replaced with references to
  235. the first element of the multi-equality, no matter if that first element is
  236. a column of a top-level table or of table from some SJM nest.
  237. (REF-SUBST-SJM)
  238. The case where the first element is a table from an SJM nest $SJM is ok,
  239. because it can be proven that $SJM uses SJ-Materialization-Scan, and
  240. "unpacks" correct column values to the first element during the main
  241. execution phase.
  242. 2. Item_equal elimination
  243. ~~~~~~~~~~~~~~~~~~~~~~~~~
  244. (In the code: eliminate_item_equal) This is a process of taking
  245. multiple-equal(a,b,c,d,e)
  246. and replacing it with an equivalent expression which is an AND of pair-wise
  247. equalities:
  248. a=b AND a=c AND ...
  249. The equalities are picked such that for any given join prefix (t1,t2...) the
  250. subset of equalities that can be evaluated gives the most restrictive
  251. filtering.
  252. Without SJM nests, it is sufficient to compare every multi-equality member
  253. with the first one:
  254. elem1=elem2 AND elem1=elem3 AND elem1=elem4 ...
  255. When SJM nests are present, we should take care not to construct equalities
  256. that violate the (SJM-RULE). This is achieved by generating separate sets of
  257. equalities for top-level tables and for inner tables. That is, for the join
  258. order
  259. ot1 - ot2 --\                    /--- ot3 -- ot5
  260.              \--- it1 --- it2 --/
  261. we will generate
  262. ot1.col=ot2.col
  263. ot1.col=ot3.col
  264. ot1.col=ot5.col
  265. it2.col=it1.col
  266. 2.1 The problem with Item_equals and ORs
  267. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  268. As has been mentioned above, multiple equalities are pushed down into OR
  269. clauses, possibly building clauses like this:
  270. func(it.col2) OR multiple-equal(it1.col1, it1.col2, ot1.col) (1)
  271. where the first part of the clause has references to inner tables, while the
  272. second has references to the top-level tables, which is a violation of
  273. (SJM-RULE).
  274. AND-clauses of this kind do not create problems, because make_cond_for_table()
  275. will take them apart. OR-clauses will not be split. It is possible to
  276. split-out the part that's dependent on the inner table:
  277. func(it.col2) OR it1.col1=it1.col2
  278. but this is a less-restrictive condition than condition (1). Current execution
  279. scheme will still try to generate the "remainder" condition:
  280. func(it.col2) OR it1.col1=ot1.col
  281. which is a violation of (SJM-RULE).
  282. QQ: "ot1.col=it1.col" is checked at the upper level. Why was it not removed
  283. here?
  284. AA: because has a proper subset of conditions that are found on this level.
  285. consider a join order of ot, sjm(it)
  286. and a condition
  287. ot.col=it.col AND ( ot.col=it.col='foo' OR it.col2='bar')
  288. we will produce:
  289. table ot: nothing
  290. table it: ot.col=it.col AND (ot.col='foo' OR it.col2='bar')
  291. ^^^^ ^^^^^^^^^^^^^^^^
  292. | \ the problem is that
  293. | this part of the condition didn't
  294. | receive a substitution
  295. |
  296. +--- it was correct to subst, 'ot' is
  297. the left-most.
  298. Does it make sense to push "inner=outer" down into ORs?
  299. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  300. Yes. Consider the query:
  301. select * from ot
  302. where ot.col in (select it.col from it where (it.col='foo' OR it.col='bar'))
  303. here, it may be useful to infer that
  304. (ot.col='foo' OR ot.col='bar') (CASE-FOR-SUBST)
  305. and attach that condition to the table 'ot'.
  306. Possible solutions for Item_equals and ORs
  307. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  308. Solution #1
  309. ~~~~~~~~~~~
  310. Let make_cond_for_table() analyze the OR clauses it has produced and
  311. discard them if they violate (SJM-RULE). This solution would allow handling
  312. cases like (CASE-FOR-SUBST) at the expense of making semantics of
  313. make_cond_for_table() complicated.
  314. Solution #2
  315. ~~~~~~~~~~~
  316. Before the equality propagation phase, none of the OR clauses violate the
  317. (SJM-RULE). This way, if we remember which tables the original equality
  318. referred to, we can only generate equalities that refer to the outer (or inner)
  319. tables. Note that this will disallow handling of cases like (CASE-FOR-SUBST).
  320. Currently, solution #2 is implemented.
  321. */
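/*
  Illustration only, not part of the server code. A minimal sketch of the
  "separate sets of equalities" idea described above: each member of a
  multiple equality is paired only with the first member of its own group
  (top-level tables, or one particular SJM nest), so that no generated
  equality mixes the main execution phase with the materialization phase.
  EqMember and expand_multiple_equal() are hypothetical names, and members
  are assumed to be listed in join order. The real eliminate_item_equal()
  handles many more cases.
*/

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of one multiple-equality member.
struct EqMember
{
  std::string column;  // e.g. "ot1.col"
  int sjm_nest;        // 0 for a top-level table, otherwise an SJM nest id
};

// Expand multiple-equal(m[0], m[1], ...) into pair-wise equalities.
// Every member is compared with the first member of its own group only.
std::vector<std::string> expand_multiple_equal(const std::vector<EqMember> &m)
{
  std::vector<std::string> result;
  std::vector<const EqMember *> heads;  // first member seen for each group
  for (const EqMember &cur : m)
  {
    assert(!cur.column.empty());
    const EqMember *head= nullptr;
    for (const EqMember *h : heads)
    {
      if (h->sjm_nest == cur.sjm_nest)
        head= h;
    }
    if (!head)
      heads.push_back(&cur);           // cur becomes the head of its group
    else
      result.push_back(head->column + "=" + cur.column);
  }
  return result;
}
```

/*
  For the join order ot1, ot2, sjm(it1, it2), ot3, ot5 this sketch yields
  three equalities against ot1.col and one equality between it1.col and
  it2.col, which matches the grouping shown in the comment above.
*/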
  322. LEX_CSTRING weedout_key= {STRING_WITH_LEN("weedout_key")};
  323. static
  324. bool subquery_types_allow_materialization(THD *thd, Item_in_subselect *in_subs);
  325. static bool replace_where_subcondition(JOIN *, Item **, Item *, Item *, bool);
  326. static int subq_sj_candidate_cmp(Item_in_subselect* el1, Item_in_subselect* el2,
  327. void *arg);
  328. static void reset_equality_number_for_subq_conds(Item * cond);
  329. static bool convert_subq_to_sj(JOIN *parent_join, Item_in_subselect *subq_pred);
  330. static bool convert_subq_to_jtbm(JOIN *parent_join,
  331. Item_in_subselect *subq_pred, bool *remove);
  332. static TABLE_LIST *alloc_join_nest(THD *thd);
  333. static uint get_tmp_table_rec_length(Ref_ptr_array p_list, uint elements);
  334. bool find_eq_ref_candidate(TABLE *table, table_map sj_inner_tables);
  335. static SJ_MATERIALIZATION_INFO *
  336. at_sjmat_pos(const JOIN *join, table_map remaining_tables, const JOIN_TAB *tab,
  337. uint idx, bool *loose_scan);
  338. void best_access_path(JOIN *join, JOIN_TAB *s,
  339. table_map remaining_tables, uint idx,
  340. bool disable_jbuf, double record_count,
  341. POSITION *pos, POSITION *loose_scan_pos);
  342. void trace_plan_prefix(JOIN *join, uint idx, table_map remaining_tables);
  343. static Item *create_subq_in_equalities(THD *thd, SJ_MATERIALIZATION_INFO *sjm,
  344. Item_in_subselect *subq_pred);
  345. static bool remove_sj_conds(THD *thd, Item **tree);
  346. static bool is_cond_sj_in_equality(Item *item);
  347. static bool sj_table_is_included(JOIN *join, JOIN_TAB *join_tab);
  348. static Item *remove_additional_cond(Item* conds);
  349. static void remove_subq_pushed_predicates(JOIN *join, Item **where);
  350. enum_nested_loop_state
  351. end_sj_materialize(JOIN *join, JOIN_TAB *join_tab, bool end_of_records);
  352. /*
  353. Check if Materialization strategy is allowed for given subquery predicate.
  354. @param thd Thread handle
  355. @param in_subs The subquery predicate
  356. @param child_select The select inside predicate (the function will
  357. check it is the only one)
  358. @return TRUE - Materialization is applicable
  359. FALSE - Otherwise
  360. */
  361. bool is_materialization_applicable(THD *thd, Item_in_subselect *in_subs,
  362. st_select_lex *child_select)
  363. {
  364. st_select_lex_unit* parent_unit= child_select->master_unit();
  365. /*
  366. Check if the subquery predicate can be executed via materialization.
  367. The required conditions are:
  368. 0. The materialization optimizer switch was set.
  369. 1. Subquery is a single SELECT (not a UNION).
  370. TODO: this is a limitation that can be fixed
  371. 2. Subquery is not a table-less query. In this case there is no
  372. point in materializing.
  373. 2A The upper query is not a table-less SELECT ... FROM DUAL. We
  374. can't do materialization for SELECT .. FROM DUAL because it
  375. does not call setup_subquery_materialization(). We could make
  376. SELECT ... FROM DUAL call that function but that doesn't seem
  377. to be the case that is worth handling.
  378. 3. Either the subquery predicate is a top-level predicate, or at
  379. least one partial match strategy is enabled. If no partial match
  380. strategy is enabled, then materialization cannot be used for
  381. non-top-level queries because it cannot handle NULLs correctly.
  382. 4. Subquery is non-correlated
  383. TODO:
  384. This condition is too restrictive (limitation). It can be extended to:
  385. (Subquery is non-correlated ||
  386. Subquery is correlated to any query outer to IN predicate ||
  387. (Subquery is correlated to the immediate outer query &&
  388. Subquery !contains {GROUP BY, ORDER BY [LIMIT],
  389. aggregate functions}) && subquery predicate is not under "NOT IN"))
  390. 5. Subquery does not contain recursive references
  391. A note about prepared statements: we want the if-branch to be taken on
  392. PREPARE and each EXECUTE. The rewrites are only done once, but we need
  393. select_lex->sj_subselects list to be populated for every EXECUTE.
  394. */
  395. if (optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION) && // 0
  396. !child_select->is_part_of_union() && // 1
  397. parent_unit->first_select()->leaf_tables.elements && // 2
  398. child_select->outer_select() &&
  399. child_select->outer_select()->table_list.first && // 2A
  400. subquery_types_allow_materialization(thd, in_subs) &&
  401. (in_subs->is_top_level_item() || //3
  402. optimizer_flag(thd,
  403. OPTIMIZER_SWITCH_PARTIAL_MATCH_ROWID_MERGE) || //3
  404. optimizer_flag(thd,
  405. OPTIMIZER_SWITCH_PARTIAL_MATCH_TABLE_SCAN)) && //3
  406. !in_subs->is_correlated && //4
  407. !in_subs->with_recursive_reference) //5
  408. {
  409. return TRUE;
  410. }
  411. return FALSE;
  412. }
  413. /*
  414. Check if we need JOIN::prepare()-phase subquery rewrites and if yes, do them
  415. SYNOPSIS
  416. check_and_do_in_subquery_rewrites()
  417. join Subquery's join
  418. DESCRIPTION
  419. Check if we need to do
  420. - subquery -> mergeable semi-join rewrite
  421. - if the subquery can be handled with materialization
  422. - 'substitution' rewrite for table-less subqueries like "(select 1)"
  423. - IN->EXISTS rewrite
  424. and, depending on the rewrite, either do it, or record it to be done at a
  425. later phase.
  426. RETURN
  427. 0 - OK
  428. Other - Some sort of query error
  429. */
  430. int check_and_do_in_subquery_rewrites(JOIN *join)
  431. {
  432. THD *thd=join->thd;
  433. st_select_lex *select_lex= join->select_lex;
  434. st_select_lex_unit* parent_unit= select_lex->master_unit();
  435. DBUG_ENTER("check_and_do_in_subquery_rewrites");
  436. /*
  437. IN/ALL/ANY rewrites are not applicable for so-called fake select
  438. (this select exists only to filter results of union if it is needed).
  439. */
  440. if (select_lex == select_lex->master_unit()->fake_select_lex)
  441. DBUG_RETURN(0);
  442. /*
  443. If
  444. 1) this join is inside a subquery (of any type except FROM-clause
  445. subquery) and
  446. 2) we aren't just normalizing a VIEW
  447. Then perform early unconditional subquery transformations:
  448. - Convert subquery predicate into semi-join, or
  449. - Mark the subquery for execution using materialization, or
  450. - Perform IN->EXISTS transformation, or
  451. - Perform more/less ALL/ANY -> MIN/MAX rewrite
  452. - Substitute trivial scalar-context subquery with its value
  453. TODO: for PS, make the whole block execute only on the first execution
  454. */
  455. Item_subselect *subselect;
  456. if (!thd->lex->is_view_context_analysis() && // (2)
  457. (subselect= parent_unit->item)) // (1)
  458. {
  459. Item_in_subselect *in_subs= NULL;
  460. Item_allany_subselect *allany_subs= NULL;
  461. Item_subselect::subs_type substype= subselect->substype();
  462. switch (substype) {
  463. case Item_subselect::IN_SUBS:
  464. in_subs= (Item_in_subselect *)subselect;
  465. break;
  466. case Item_subselect::ALL_SUBS:
  467. case Item_subselect::ANY_SUBS:
  468. allany_subs= (Item_allany_subselect *)subselect;
  469. break;
  470. default:
  471. break;
  472. }
  473. /*
  474. Try removing "ORDER BY" or even "ORDER BY ... LIMIT" from certain kinds
  475. of subqueries. The removal might enable further transformations.
  476. */
  477. if (substype == Item_subselect::IN_SUBS ||
  478. substype == Item_subselect::EXISTS_SUBS ||
  479. substype == Item_subselect::ANY_SUBS ||
  480. substype == Item_subselect::ALL_SUBS)
  481. {
  482. // (1) - ORDER BY without LIMIT can be removed from IN/EXISTS subqueries
  483. // (2) - for EXISTS, can also remove "ORDER BY ... LIMIT n",
  484. // but cannot remove "ORDER BY ... LIMIT n OFFSET m"
  485. if (!select_lex->select_limit || // (1)
  486. (substype == Item_subselect::EXISTS_SUBS && // (2)
  487. !select_lex->offset_limit)) // (2)
  488. {
  489. select_lex->join->order= 0;
  490. select_lex->join->skip_sort_order= 1;
  491. }
  492. }
  493. /* Resolve expressions and perform semantic analysis for IN query */
  494. if (in_subs != NULL)
  495. /*
  496. TODO: Add the condition below to this if statement when we have proper
  497. support for is_correlated handling for materialized semijoins.
  498. If we were to add this condition now, the fix_fields() call in
  499. convert_subq_to_sj() would force the flag is_correlated to be set
  500. erroneously for prepared queries.
  501. thd->stmt_arena->state != Query_arena::PREPARED)
  502. */
  503. {
  504. SELECT_LEX *current= thd->lex->current_select;
  505. thd->lex->current_select= current->return_after_parsing();
  506. char const *save_where= thd->where;
  507. thd->where= "IN/ALL/ANY subquery";
  508. bool failure= in_subs->left_expr->fix_fields_if_needed(thd,
  509. &in_subs->left_expr);
  510. thd->lex->current_select= current;
  511. thd->where= save_where;
  512. if (failure)
  513. DBUG_RETURN(-1); /* purecov: deadcode */
  514. /*
  515. Check if the left and right expressions have the same # of
  516. columns, i.e. we don't have a case like
  517. (oe1, oe2) IN (SELECT ie1, ie2, ie3 ...)
  518. TODO why do we have this duplicated in IN->EXISTS transformers?
  519. psergey-todo: fix these: grep for duplicated_subselect_card_check
  520. */
  521. if (select_lex->item_list.elements != in_subs->left_expr->cols())
  522. {
  523. my_error(ER_OPERAND_COLUMNS, MYF(0), in_subs->left_expr->cols());
  524. DBUG_RETURN(-1);
  525. }
  526. }
  527. DBUG_PRINT("info", ("Checking if subq can be converted to semi-join"));
  528. /*
  529. Check if we're in subquery that is a candidate for flattening into a
  530. semi-join (which is done in flatten_subqueries()). The
  531. requirements are:
  532. 1. Subquery predicate is an IN/=ANY subq predicate
  533. 2. Subquery is a single SELECT (not a UNION)
  534. 3. Subquery does not have GROUP BY or ORDER BY
  535. 4. Subquery does not use aggregate functions or HAVING
  536. 5. Subquery predicate is at the AND-top-level of ON/WHERE clause
  537. 6. We are not in a subquery of a single table UPDATE/DELETE that
  538. doesn't have a JOIN (TODO: We should handle this at some
  539. point by switching to multi-table UPDATE/DELETE)
  540. 7. We're not in a table-less subquery like "SELECT 1"
  541. 8. No execution method was already chosen (by a prepared statement)
  542. 9. Parent select is not a table-less select
  543. 10. Neither parent nor child select have STRAIGHT_JOIN option.
  544. 11. It is first optimisation (the subquery could be moved from ON
  545. clause during first optimisation and then be considered for SJ
  546. on the second when it is too late)
  547. */
  548. if (optimizer_flag(thd, OPTIMIZER_SWITCH_SEMIJOIN) &&
  549. in_subs && // 1
  550. !select_lex->is_part_of_union() && // 2
  551. !select_lex->group_list.elements && !join->order && // 3
  552. !join->having && !select_lex->with_sum_func && // 4
  553. in_subs->emb_on_expr_nest && // 5
  554. select_lex->outer_select()->join && // 6
  555. parent_unit->first_select()->leaf_tables.elements && // 7
  556. !in_subs->has_strategy() && // 8
  557. select_lex->outer_select()->table_list.first && // 9
  558. !((join->select_options | // 10
  559. select_lex->outer_select()->join->select_options) // 10
  560. & SELECT_STRAIGHT_JOIN) && // 10
  561. select_lex->first_cond_optimization) // 11
  562. {
  563. DBUG_PRINT("info", ("Subquery is semi-join conversion candidate"));
  564. (void)subquery_types_allow_materialization(thd, in_subs);
  565. in_subs->is_flattenable_semijoin= TRUE;
  566. /* Register the subquery for further processing in flatten_subqueries() */
  567. if (!in_subs->is_registered_semijoin)
  568. {
  569. Query_arena *arena, backup;
  570. arena= thd->activate_stmt_arena_if_needed(&backup);
  571. select_lex->outer_select()->sj_subselects.push_back(in_subs,
  572. thd->mem_root);
  573. if (arena)
  574. thd->restore_active_arena(arena, &backup);
  575. in_subs->is_registered_semijoin= TRUE;
  576. OPT_TRACE_TRANSFORM(thd, trace_wrapper, trace_transform,
  577. select_lex->select_number,
  578. "IN (SELECT)", "semijoin");
  579. trace_transform.add("chosen", true);
  580. }
  581. }
  582. else
  583. {
  584. DBUG_PRINT("info", ("Subquery can't be converted to merged semi-join"));
  585. /* Test if the user has set a legal combination of optimizer switches. */
  586. if (!optimizer_flag(thd, OPTIMIZER_SWITCH_IN_TO_EXISTS) &&
  587. !optimizer_flag(thd, OPTIMIZER_SWITCH_MATERIALIZATION))
  588. my_error(ER_ILLEGAL_SUBQUERY_OPTIMIZER_SWITCHES, MYF(0));
  589. /*
  590. Transform each subquery predicate according to its overloaded
  591. transformer.
  592. */
  593. if (subselect->select_transformer(join))
  594. DBUG_RETURN(-1);
  595. /*
  596. If the subquery predicate is IN/=ANY, analyse and set all possible
  597. subquery execution strategies based on optimizer switches and syntactic
  598. properties.
  599. */
  600. if (in_subs && !in_subs->has_strategy())
  601. {
  602. if (is_materialization_applicable(thd, in_subs, select_lex))
  603. {
  604. in_subs->add_strategy(SUBS_MATERIALIZATION);
  605. /*
  606. If the subquery is an AND-part of WHERE register for being processed
  607. with jtbm strategy
  608. */
  609. if (in_subs->emb_on_expr_nest == NO_JOIN_NEST &&
  610. optimizer_flag(thd, OPTIMIZER_SWITCH_SEMIJOIN))
  611. {
  612. in_subs->is_flattenable_semijoin= FALSE;
  613. if (!in_subs->is_registered_semijoin)
  614. {
  615. Query_arena *arena, backup;
  616. arena= thd->activate_stmt_arena_if_needed(&backup);
  617. select_lex->outer_select()->sj_subselects.push_back(in_subs,
  618. thd->mem_root);
  619. if (arena)
  620. thd->restore_active_arena(arena, &backup);
  621. in_subs->is_registered_semijoin= TRUE;
  622. }
  623. }
  624. }
  625. /*
  626. IN-TO-EXISTS is the only universal strategy. Choose it if the user
  627. allowed it via an optimizer switch, or if materialization is not
  628. possible.
  629. */
  630. if (optimizer_flag(thd, OPTIMIZER_SWITCH_IN_TO_EXISTS) ||
  631. !in_subs->has_strategy())
  632. in_subs->add_strategy(SUBS_IN_TO_EXISTS);
  633. }
  634. /* Check if max/min optimization applicable */
  635. if (allany_subs && !allany_subs->is_set_strategy())
  636. {
  637. uchar strategy= (allany_subs->is_maxmin_applicable(join) ?
  638. (SUBS_MAXMIN_INJECTED | SUBS_MAXMIN_ENGINE) :
  639. SUBS_IN_TO_EXISTS);
  640. allany_subs->add_strategy(strategy);
  641. }
  642. }
  643. }
  644. DBUG_RETURN(0);
  645. }
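/*
  Illustration only, not part of the server code. A minimal sketch of how
  the non-semi-join branch above collects candidate strategies for an
  IN/=ANY predicate: materialization is added when it is applicable, and
  IN->EXISTS is added when its switch is on or when nothing else was
  chosen. The DEMO_* bit values and choose_in_strategies() are
  hypothetical stand-ins for the real strategy flags and add_strategy().
*/

```cpp
#include <cassert>

// Hypothetical strategy bits; the values are illustrative only.
enum : unsigned
{
  DEMO_IN_TO_EXISTS=    1u << 0,
  DEMO_MATERIALIZATION= 1u << 1
};

// Return the set of strategies the optimizer may later choose from.
unsigned choose_in_strategies(bool materialization_applicable,
                              bool in_to_exists_switch_on)
{
  unsigned strategies= 0;
  if (materialization_applicable)
    strategies|= DEMO_MATERIALIZATION;
  // IN->EXISTS is the universal fallback: use it if allowed, or if
  // no other strategy could be registered.
  if (in_to_exists_switch_on || !strategies)
    strategies|= DEMO_IN_TO_EXISTS;
  return strategies;
}
```

/*
  The result is never empty, which mirrors the statement above that
  IN-TO-EXISTS is the only universal strategy.
*/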
  646. /**
  647. @brief Check if subquery's compared types allow materialization.
  648. @param in_subs Subquery predicate, updated as follows:
  649. types_allow_materialization TRUE if subquery materialization is allowed.
  650. sjm_scan_allowed If types_allow_materialization is TRUE,
  651. indicates whether it is possible to use subquery
  652. materialization and scan the materialized table.
  653. @retval TRUE If subquery types allow materialization.
  654. @retval FALSE Otherwise.
  655. @details
  656. This is a temporary fix for BUG#36752.
  657. There are two subquery materialization strategies:
  658. 1. Materialize and do index lookups in the materialized table. See
  659. BUG#36752 for description of restrictions we need to put on the
  660. compared expressions.
  661. 2. Materialize and then do a full scan of the materialized table. At the
  662. moment, this strategy's applicability criteria are even stricter than
  663. in #1.
  664. This is so because of the following: consider an uncorrelated subquery
  665. ...WHERE (ot1.col1, ot2.col2 ...) IN (SELECT ie1,ie2,... FROM it1 ...)
  666. and a join order that could be used to do sjm-materialization:
  667. SJM-Scan(it1, it1), ot1, ot2
  668. IN-equalities will be parts of conditions attached to the outer tables:
  669. ot1: ot1.col1 = ie1 AND ... (C1)
  670. ot2: ot2.col2 = ie2 AND ... (C2)
  671. besides those there may be additional references to ie1 and ie2
  672. generated by equality propagation. The problem with evaluating C1 and
  673. C2 is that ie{1,2} refer to subquery tables' columns, while we only have
  674. current value of materialization temptable. Our solution is to
  675. * require that all ie{N} are table column references. This allows
  676. to copy the values of materialization temptable columns to the
  677. original table's columns (see setup_sj_materialization for more
  678. details)
  679. * require that compared columns have exactly the same type. This is
  680. a temporary measure to avoid BUG#36752-type problems.
  681. JOIN_TAB::keyuse_is_valid_for_access_in_chosen_plan expects that for Semi Join Materialization
  682. Scan all the items in the select list of the IN Subquery are of the type Item::FIELD_ITEM.
  683. */
  684. static
  685. bool subquery_types_allow_materialization(THD* thd, Item_in_subselect *in_subs)
  686. {
  687. DBUG_ENTER("subquery_types_allow_materialization");
  688. DBUG_ASSERT(in_subs->left_expr->is_fixed());
  689. List_iterator<Item> it(in_subs->unit->first_select()->item_list);
  690. uint elements= in_subs->unit->first_select()->item_list.elements;
  691. const char* cause= NULL;
  692. in_subs->types_allow_materialization= FALSE; // Assign default values
  693. in_subs->sjm_scan_allowed= FALSE;
  694. OPT_TRACE_TRANSFORM(thd, trace_wrapper, trace_transform,
  695. in_subs->get_select_lex()->select_number,
  696. "IN (SELECT)", "materialization");
  697. bool all_are_fields= TRUE;
  698. uint32 total_key_length = 0;
  699. for (uint i= 0; i < elements; i++)
  700. {
  701. Item *outer= in_subs->left_expr->element_index(i);
  702. Item *inner= it++;
  703. all_are_fields &= (outer->real_item()->type() == Item::FIELD_ITEM &&
  704. inner->real_item()->type() == Item::FIELD_ITEM);
  705. total_key_length += inner->max_length;
  706. if (!inner->type_handler()->subquery_type_allows_materialization(inner,
  707. outer))
  708. {
  709. trace_transform.add("possible", false);
  710. trace_transform.add("cause", "types mismatch");
  711. DBUG_RETURN(FALSE);
  712. }
  713. }
  714. /*
  715. Make sure that create_tmp_table will not fail due to too long keys.
  716. See MDEV-7122. This check is performed inside create_tmp_table also and
  717. we must do it so that we know the table has keys created.
  718. Make sure that the length of the key for the temp_table is greater
  719. than 0.
  720. */
  721. if (!total_key_length)
  722. cause= "zero length key for materialized table";
  723. else if (total_key_length > tmp_table_max_key_length())
  724. cause= "length of key greater than allowed key length for materialized tables";
  725. else if (elements > tmp_table_max_key_parts())
  726. cause= "#keyparts greater than allowed key parts for materialized tables";
  727. else
  728. {
  729. in_subs->types_allow_materialization= TRUE;
  730. in_subs->sjm_scan_allowed= all_are_fields;
  731. trace_transform.add("sjm_scan_allowed", all_are_fields)
  732. .add("possible", true);
  733. DBUG_PRINT("info",("subquery_types_allow_materialization: ok, allowed"));
  734. DBUG_RETURN(TRUE);
  735. }
  736. trace_transform.add("possible", false).add("cause", cause);
  737. DBUG_RETURN(FALSE);
  738. }
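/*
  Illustration only, not part of the server code. A minimal sketch of the
  key-limit checks performed at the end of the function above, written
  against plain integers. DemoKeyCheck and demo_check_key_limits() are
  hypothetical names, and the limits are passed in as parameters instead
  of being taken from tmp_table_max_key_length() and
  tmp_table_max_key_parts().
*/

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class DemoKeyCheck { OK, ZERO_LENGTH, KEY_TOO_LONG, TOO_MANY_PARTS };

// column_max_lengths holds max_length of every subquery select list item.
DemoKeyCheck demo_check_key_limits(const std::vector<uint32_t> &column_max_lengths,
                                   uint32_t max_key_length,
                                   uint32_t max_key_parts)
{
  uint32_t total_key_length= 0;
  for (uint32_t len : column_max_lengths)
    total_key_length+= len;

  // Same order of checks as in the function above.
  if (!total_key_length)
    return DemoKeyCheck::ZERO_LENGTH;
  if (total_key_length > max_key_length)
    return DemoKeyCheck::KEY_TOO_LONG;
  if (column_max_lengths.size() > max_key_parts)
    return DemoKeyCheck::TOO_MANY_PARTS;
  return DemoKeyCheck::OK;
}
```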
  739. /**
  740. Apply max min optimization of all/any subselect
  741. */
  742. bool JOIN::transform_max_min_subquery()
  743. {
  744. DBUG_ENTER("JOIN::transform_max_min_subquery");
  745. Item_subselect *subselect= unit->item;
  746. if (!subselect || (subselect->substype() != Item_subselect::ALL_SUBS &&
  747. subselect->substype() != Item_subselect::ANY_SUBS))
  748. DBUG_RETURN(0);
  749. DBUG_RETURN(((Item_allany_subselect *) subselect)->
  750. transform_into_max_min(this));
  751. }
  752. /*
  753. Finalize IN->EXISTS conversion in case we couldn't use materialization.
  754. DESCRIPTION Invoke the IN->EXISTS converter
  755. Replace the Item_in_subselect with its wrapper Item_in_optimizer in WHERE.
  756. RETURN
  757. FALSE - Ok
  758. TRUE - Fatal error
  759. */
  760. bool make_in_exists_conversion(THD *thd, JOIN *join, Item_in_subselect *item)
  761. {
  762. DBUG_ENTER("make_in_exists_conversion");
  763. JOIN *child_join= item->unit->first_select()->join;
  764. bool res;
  765. /*
  766. We're going to finalize IN->EXISTS conversion.
  767. Normally, IN->EXISTS conversion takes place inside the
  768. Item_subselect::fix_fields() call, where item_subselect->is_fixed()==FALSE (as
  769. fix_fields() hasn't finished yet) and item_subselect->changed==FALSE (as
  770. the conversion hasn't been finalized)
  771. At the end of Item_subselect::fix_fields() we had to set fixed=TRUE,
  772. changed=TRUE (the only other option would have been to return error).
  773. So, now we have to set these back for the duration of select_transformer()
  774. call.
  775. */
  776. item->changed= 0;
  777. item->fixed= 0;
  778. SELECT_LEX *save_select_lex= thd->lex->current_select;
  779. thd->lex->current_select= item->unit->first_select();
  780. res= item->select_transformer(child_join);
  781. thd->lex->current_select= save_select_lex;
  782. if (res)
  783. DBUG_RETURN(TRUE);
  784. item->changed= 1;
  785. item->fixed= 1;
  786. Item *substitute= item->substitution;
  787. bool do_fix_fields= !item->substitution->is_fixed();
  788. /*
  789. The Item_subselect has already been wrapped with Item_in_optimizer, so we
  790. should search for item->optimizer, not 'item'.
  791. */
  792. Item *replace_me= item->optimizer;
  793. DBUG_ASSERT(replace_me==substitute);
  794. Item **tree= (item->emb_on_expr_nest == NO_JOIN_NEST)?
  795. &join->conds : &(item->emb_on_expr_nest->on_expr);
  796. if (replace_where_subcondition(join, tree, replace_me, substitute,
  797. do_fix_fields))
  798. DBUG_RETURN(TRUE);
  799. item->substitution= NULL;
  800. /*
  801. If this is a prepared statement, repeat the above operation for
  802. prep_where (or prep_on_expr).
  803. */
  804. if (!thd->stmt_arena->is_conventional())
  805. {
  806. tree= (item->emb_on_expr_nest == (TABLE_LIST*)NO_JOIN_NEST)?
  807. &join->select_lex->prep_where :
  808. &(item->emb_on_expr_nest->prep_on_expr);
  809. if (replace_where_subcondition(join, tree, replace_me, substitute,
  810. FALSE))
  811. DBUG_RETURN(TRUE);
  812. }
  813. DBUG_RETURN(FALSE);
  814. }
  815. bool check_for_outer_joins(List<TABLE_LIST> *join_list)
  816. {
  817. TABLE_LIST *table;
  818. NESTED_JOIN *nested_join;
  819. List_iterator<TABLE_LIST> li(*join_list);
  820. while ((table= li++))
  821. {
  822. if ((nested_join= table->nested_join))
  823. {
  824. if (check_for_outer_joins(&nested_join->join_list))
  825. return TRUE;
  826. }
  827. if (table->outer_join)
  828. return TRUE;
  829. }
  830. return FALSE;
  831. }
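/*
  Illustration only (not part of the server source): a minimal standalone
  sketch of the recursive walk that check_for_outer_joins() performs.
  JoinNode and has_outer_join() are hypothetical stand-ins for
  TABLE_LIST/NESTED_JOIN and the real function; only the control flow is
  mirrored.
*/
```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for TABLE_LIST: a base table or a nest of tables.
struct JoinNode
{
  bool outer_join= false;          // true if this node is outer-joined
  std::vector<JoinNode> children;  // non-empty when the node is a join nest
};

// Returns true if any node in the list, at any nesting depth, is outer-joined.
bool has_outer_join(const std::vector<JoinNode> &join_list)
{
  for (const JoinNode &node : join_list)
  {
    // Descend into the nest first, as the original does.
    if (!node.children.empty() && has_outer_join(node.children))
      return true;
    if (node.outer_join)
      return true;
  }
  return false;
}
```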
  832. void find_and_block_conversion_to_sj(Item *to_find,
  833. List_iterator_fast<Item_in_subselect> &li)
  834. {
  835. if (to_find->type() == Item::FUNC_ITEM &&
  836. ((Item_func*)to_find)->functype() == Item_func::IN_OPTIMIZER_FUNC)
  837. to_find= ((Item_in_optimizer*)to_find)->get_wrapped_in_subselect_item();
  838. if (to_find->type() != Item::SUBSELECT_ITEM ||
  839. ((Item_subselect *) to_find)->substype() != Item_subselect::IN_SUBS)
  840. return;
  841. Item_in_subselect *in_subq;
  842. li.rewind();
  843. while ((in_subq= li++))
  844. {
  845. if (in_subq == to_find)
  846. {
  847. in_subq->block_conversion_to_sj();
  848. return;
  849. }
  850. }
  851. }
  852. /*
  853. Convert semi-join subquery predicates into semi-join join nests
  854. SYNOPSIS
  855. convert_join_subqueries_to_semijoins()
  856. DESCRIPTION
  857. Convert candidate subquery predicates into semi-join join nests. This
  858. transformation is performed once in query lifetime and is irreversible.
  859. Conversion of one subquery predicate
  860. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  861. We start with a join that has a semi-join subquery:
  862. SELECT ...
  863. FROM ot, ...
  864. WHERE oe IN (SELECT ie FROM it1 ... itN WHERE subq_where) AND outer_where
  865. and convert it into a semi-join nest:
  866. SELECT ...
  867. FROM ot SEMI JOIN (it1 ... itN), ...
  868. WHERE outer_where AND subq_where AND oe=ie
  869. that is, in order to do the conversion, we need to
  870. * Create the "SEMI JOIN (it1 .. itN)" part and add it into the parent
  871. query's FROM structure.
  872. * Add "AND subq_where AND oe=ie" into parent query's WHERE (or ON if
  873. the subquery predicate was in an ON expression)
  874. * Remove the subquery predicate from the parent query's WHERE
  875. Considerations when converting many predicates
  876. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  877. A join may have at most MAX_TABLES tables. This may prevent us from
  878. flattening all subqueries when the total number of tables in parent and
  879. child selects exceeds MAX_TABLES.
  880. We deal with this problem by flattening children's subqueries first and
  881. then using a heuristic rule to determine each subquery predicate's
  882. "priority".
  883. RETURN
  884. FALSE OK
  885. TRUE Error
  886. */
  887. bool convert_join_subqueries_to_semijoins(JOIN *join)
  888. {
  889. Query_arena *arena, backup;
  890. Item_in_subselect *in_subq;
  891. THD *thd= join->thd;
  892. DBUG_ENTER("convert_join_subqueries_to_semijoins");
  893. if (join->select_lex->sj_subselects.is_empty())
  894. DBUG_RETURN(FALSE);
  895. List_iterator_fast<Item_in_subselect> li(join->select_lex->sj_subselects);
  896. while ((in_subq= li++))
  897. {
  898. SELECT_LEX *subq_sel= in_subq->get_select_lex();
  899. if (subq_sel->handle_derived(thd->lex, DT_MERGE))
  900. DBUG_RETURN(TRUE);
  901. if (subq_sel->join->transform_in_predicates_into_in_subq(thd))
  902. DBUG_RETURN(TRUE);
  903. subq_sel->update_used_tables();
  904. }
  /*
    Check all candidates for semi-join conversion that occur
    in ON expressions of an outer join. Set the flag blocking
    this conversion for them.
  */
  910. TABLE_LIST *tbl;
  911. List_iterator<TABLE_LIST> ti(join->select_lex->leaf_tables);
  912. while ((tbl= ti++))
  913. {
  914. TABLE_LIST *embedded;
  915. TABLE_LIST *embedding= tbl;
  916. do
  917. {
  918. embedded= embedding;
  919. bool block_conversion_to_sj= false;
  920. if (embedded->on_expr)
  921. {
  922. /*
  923. Conversion of an IN subquery predicate into semi-join
  924. is blocked now if the predicate occurs:
  925. - in the ON expression of an outer join
  926. - in the ON expression of an inner join embedded directly
  927. or indirectly in the inner nest of an outer join
  928. */
  929. for (TABLE_LIST *tl= embedded; tl; tl= tl->embedding)
  930. {
  931. if (tl->outer_join)
  932. {
  933. block_conversion_to_sj= true;
  934. break;
  935. }
  936. }
  937. }
  938. if (block_conversion_to_sj)
  939. {
  940. Item *cond= embedded->on_expr;
  941. if (!cond)
  942. ;
  943. else if (cond->type() != Item::COND_ITEM)
  944. find_and_block_conversion_to_sj(cond, li);
  945. else if (((Item_cond*) cond)->functype() ==
  946. Item_func::COND_AND_FUNC)
  947. {
  948. Item *item;
  949. List_iterator<Item> it(*(((Item_cond*) cond)->argument_list()));
  950. while ((item= it++))
  951. {
  952. find_and_block_conversion_to_sj(item, li);
  953. }
  954. }
  955. }
  956. embedding= embedded->embedding;
  957. }
  958. while (embedding &&
  959. embedding->nested_join->join_list.head() == embedded);
  960. }
  /*
    Block conversion to semi-joins for those candidates that
    are encountered in the WHERE condition of a multi-table view
    with CHECK OPTION if this view is used in UPDATE/DELETE.
    (This limitation can probably be lifted easily.)
  */
  967. li.rewind();
  968. while ((in_subq= li++))
  969. {
  970. if (in_subq->emb_on_expr_nest != NO_JOIN_NEST &&
  971. in_subq->emb_on_expr_nest->effective_with_check)
  972. {
  973. in_subq->block_conversion_to_sj();
  974. }
  975. }
  976. if (join->select_options & SELECT_STRAIGHT_JOIN)
  977. {
  978. /* Block conversion to semijoins for all candidates */
  979. li.rewind();
  980. while ((in_subq= li++))
  981. {
  982. in_subq->block_conversion_to_sj();
  983. }
  984. }
  985. li.rewind();
  986. /* First, convert child join's subqueries. We proceed bottom-up here */
  987. while ((in_subq= li++))
  988. {
  989. st_select_lex *child_select= in_subq->get_select_lex();
  990. JOIN *child_join= child_select->join;
  991. child_join->outer_tables = child_join->table_count;
  992. /*
  993. child_select->where contains only the WHERE predicate of the
  994. subquery itself here. We may be selecting from a VIEW, which has its
  995. own predicate. The combined predicates are available in child_join->conds,
  996. which was built by setup_conds() doing prepare_where() for all views.
  997. */
  998. child_select->where= child_join->conds;
  999. if (convert_join_subqueries_to_semijoins(child_join))
  1000. DBUG_RETURN(TRUE);
  1001. in_subq->sj_convert_priority=
  1002. MY_TEST(in_subq->do_not_convert_to_sj) * MAX_TABLES * 2 +
  1003. in_subq->is_correlated * MAX_TABLES + child_join->outer_tables;
  1004. }
  1005. // Temporary measure: disable semi-joins when they are together with outer
  1006. // joins.
  1007. #if 0
  1008. if (check_for_outer_joins(join->join_list))
  1009. {
  1010. in_subq= join->select_lex->sj_subselects.head();
  1011. arena= thd->activate_stmt_arena_if_needed(&backup);
  1012. goto skip_conversion;
  1013. }
  1014. #endif
  1015. //dump_TABLE_LIST_struct(select_lex, select_lex->leaf_tables);
  /*
    2. Pick which subqueries to convert:
      sort the subquery array
      - prefer correlated subqueries over uncorrelated;
      - prefer subqueries that have a greater number of outer tables;
  */
  1022. bubble_sort<Item_in_subselect>(&join->select_lex->sj_subselects,
  1023. subq_sj_candidate_cmp, NULL);
  1024. // #tables-in-parent-query + #tables-in-subquery < MAX_TABLES
  1025. /* Replace all subqueries to be flattened with Item_int(1) */
  1026. arena= thd->activate_stmt_arena_if_needed(&backup);
  1027. li.rewind();
  1028. while ((in_subq= li++))
  1029. {
  1030. bool remove_item= TRUE;
  1031. /* Stop processing if we've reached a subquery that's attached to the ON clause */
  1032. if (in_subq->do_not_convert_to_sj)
  1033. {
  1034. OPT_TRACE_TRANSFORM(thd, trace_wrapper, trace_transform,
  1035. in_subq->get_select_lex()->select_number,
  1036. "IN (SELECT)", "semijoin");
  1037. trace_transform.add("converted_to_semi_join", false)
  1038. .add("cause", "subquery attached to the ON clause");
  1039. break;
  1040. }
  1041. if (in_subq->is_flattenable_semijoin)
  1042. {
  1043. OPT_TRACE_TRANSFORM(thd, trace_wrapper, trace_transform,
  1044. in_subq->get_select_lex()->select_number,
  1045. "IN (SELECT)", "semijoin");
  1046. if (join->table_count +
  1047. in_subq->unit->first_select()->join->table_count >= MAX_TABLES)
  1048. {
  1049. trace_transform.add("converted_to_semi_join", false);
  1050. trace_transform.add("cause",
  1051. "table in parent join now exceeds MAX_TABLES");
  1052. break;
  1053. }
  1054. if (convert_subq_to_sj(join, in_subq))
  1055. goto restore_arena_and_fail;
  1056. trace_transform.add("converted_to_semi_join", true);
  1057. }
  1058. else
  1059. {
  1060. if (join->table_count + 1 >= MAX_TABLES)
  1061. break;
  1062. if (convert_subq_to_jtbm(join, in_subq, &remove_item))
  1063. goto restore_arena_and_fail;
  1064. }
  1065. if (remove_item)
  1066. {
  1067. Item **tree= (in_subq->emb_on_expr_nest == NO_JOIN_NEST)?
  1068. &join->conds : &(in_subq->emb_on_expr_nest->on_expr);
  1069. Item *replace_me= in_subq->original_item();
  1070. if (replace_where_subcondition(join, tree, replace_me,
  1071. new (thd->mem_root) Item_int(thd, 1),
  1072. FALSE))
  1073. goto restore_arena_and_fail;
  1074. }
  1075. }
  1076. //skip_conversion:
  1077. /*
  1078. 3. Finalize (perform IN->EXISTS rewrite) the subqueries that we didn't
  1079. convert:
  1080. */
  1081. while (in_subq)
  1082. {
  1083. JOIN *child_join= in_subq->unit->first_select()->join;
  1084. in_subq->changed= 0;
  1085. in_subq->fixed= 0;
  1086. SELECT_LEX *save_select_lex= thd->lex->current_select;
  1087. thd->lex->current_select= in_subq->unit->first_select();
  1088. bool res= in_subq->select_transformer(child_join);
  1089. thd->lex->current_select= save_select_lex;
  1090. if (res)
  1091. DBUG_RETURN(TRUE);
  1092. in_subq->changed= 1;
  1093. in_subq->fixed= 1;
  1094. Item *substitute= in_subq->substitution;
  1095. bool do_fix_fields= !in_subq->substitution->is_fixed();
  1096. Item **tree= (in_subq->emb_on_expr_nest == NO_JOIN_NEST)?
  1097. &join->conds : &(in_subq->emb_on_expr_nest->on_expr);
  1098. Item *replace_me= in_subq->original_item();
  1099. if (replace_where_subcondition(join, tree, replace_me, substitute,
  1100. do_fix_fields))
  1101. DBUG_RETURN(TRUE);
  1102. in_subq->substitution= NULL;
    /*
      If this is a prepared statement, repeat the above operation for
      prep_where (or prep_on_expr). Subquery-to-semijoin conversion is
      done once for a prepared statement.
    */
  1108. if (!thd->stmt_arena->is_conventional())
  1109. {
  1110. tree= (in_subq->emb_on_expr_nest == NO_JOIN_NEST)?
  1111. &join->select_lex->prep_where :
  1112. &(in_subq->emb_on_expr_nest->prep_on_expr);
  1113. /*
  1114. prep_on_expr/ prep_where may be NULL in some cases.
  1115. If that is the case, do nothing - simplify_joins() will copy
  1116. ON/WHERE expression into prep_on_expr/prep_where.
  1117. */
  1118. if (*tree && replace_where_subcondition(join, tree, replace_me, substitute,
  1119. FALSE))
  1120. DBUG_RETURN(TRUE);
  1121. }
  1122. /*
  1123. Revert to the IN->EXISTS strategy in the rare case when the subquery could
  1124. not be flattened.
  1125. */
  1126. in_subq->reset_strategy(SUBS_IN_TO_EXISTS);
  1127. if (is_materialization_applicable(thd, in_subq,
  1128. in_subq->unit->first_select()))
  1129. {
  1130. in_subq->add_strategy(SUBS_MATERIALIZATION);
  1131. }
  1132. in_subq= li++;
  1133. }
  1134. if (arena)
  1135. thd->restore_active_arena(arena, &backup);
  1136. join->select_lex->sj_subselects.empty();
  1137. DBUG_RETURN(FALSE);
  1138. restore_arena_and_fail:
  1139. if (arena)
  1140. thd->restore_active_arena(arena, &backup);
  1141. DBUG_RETURN(TRUE);
  1142. }
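/*
  Illustration only (not part of the server source): the rewrite described
  in the header comment of convert_join_subqueries_to_semijoins() turns
  "oe IN (SELECT ie ...)" into a SEMI JOIN with the equality oe=ie. The
  sketch below, using hypothetical single-column integer "tables", shows
  why this must be a semi-join rather than an ordinary inner join: an outer
  row is produced at most once, however many inner rows match.
*/
```cpp
#include <cassert>
#include <vector>

// Semi-join on oe = ie: each outer value appears in the result at most once
// per outer row, no matter how many inner rows match it.
std::vector<int> semi_join(const std::vector<int> &outer,
                           const std::vector<int> &inner)
{
  std::vector<int> result;
  for (int oe : outer)
  {
    for (int ie : inner)
    {
      if (oe == ie)
      {
        result.push_back(oe);
        break;  // the first match is enough
      }
    }
  }
  return result;
}

// Ordinary inner join on the same equality, for contrast: one output row
// per matching (outer, inner) pair.
std::vector<int> inner_join(const std::vector<int> &outer,
                            const std::vector<int> &inner)
{
  std::vector<int> result;
  for (int oe : outer)
    for (int ie : inner)
      if (oe == ie)
        result.push_back(oe);
  return result;
}
```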
/*
  Get #output_rows and scan_time estimates for a "delayed" table.

  SYNOPSIS
    get_delayed_table_estimates()
      table         IN    Table to get estimates for
      out_rows      OUT   E(#rows in the table)
      scan_time     OUT   E(scan_time).
      startup_cost  OUT   cost to populate the table.

  DESCRIPTION
    Get #output_rows and scan_time estimates for a "delayed" table. By
    "delayed" here we mean that the table is filled at the start of query
    execution. This means that the optimizer can't use table statistics to
    get a #rows estimate for it; it has to call this function instead.

    This function is expected to take different actions depending on the
    nature of the table. At the moment there is only one kind of delayed
    table: non-flattenable semi-joins.
*/
  1160. void get_delayed_table_estimates(TABLE *table,
  1161. ha_rows *out_rows,
  1162. double *scan_time,
  1163. double *startup_cost)
  1164. {
  1165. Item_in_subselect *item= table->pos_in_table_list->jtbm_subselect;
  1166. DBUG_ASSERT(item->engine->engine_type() ==
  1167. subselect_engine::HASH_SJ_ENGINE);
  1168. subselect_hash_sj_engine *hash_sj_engine=
  1169. ((subselect_hash_sj_engine*)item->engine);
  1170. *out_rows= (ha_rows)item->jtbm_record_count;
  1171. *startup_cost= item->jtbm_read_time;
  1172. /* Calculate cost of scanning the temptable */
  1173. double data_size= COST_MULT(item->jtbm_record_count,
  1174. hash_sj_engine->tmp_table->s->reclength);
  1175. /* Do like in handler::read_time */
  1176. *scan_time= data_size/IO_SIZE + 2;
  1177. }
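/*
  Illustration only (not part of the server source): a standalone sketch of
  the scan-time arithmetic in get_delayed_table_estimates(). Assumptions:
  kIoSize stands in for the server's IO_SIZE macro and is set to 4096 here
  for the sake of the example, and a plain multiplication replaces
  COST_MULT, whose definition is not shown in this file.
*/
```cpp
#include <cassert>

// Assumed I/O block size in bytes (stand-in for IO_SIZE).
constexpr double kIoSize= 4096.0;

// Estimated cost of scanning a temporary table holding `rows` records of
// `reclength` bytes each: number of I/O blocks plus a constant of 2.
double temptable_scan_time(double rows, double reclength)
{
  double data_size= rows * reclength;  // the server uses COST_MULT here
  return data_size / kIoSize + 2;
}
```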
/**
  @brief Replaces an expression destructively inside the expression tree of
  the WHERE clause.

  @note We substitute the AND/OR structure because it was copied by
  copy_andor_structure, and some changes could be made in the copy but
  should be left permanent. Also, there could be several layers of AND over
  AND and OR over OR because ::fix_fields() possibly has not been called.

  @param join The top-level query.
  @param expr The expression tree (WHERE or ON condition) to search.
  @param old_cond The expression to be replaced.
  @param new_cond The expression to be substituted.
  @param do_fix_fields If true, Item::fix_fields(THD*, Item**) is called for
  the new expression.
  @return <code>true</code> if there was an error, <code>false</code> if
  successful.
*/
  1193. static bool replace_where_subcondition(JOIN *join, Item **expr,
  1194. Item *old_cond, Item *new_cond,
  1195. bool do_fix_fields)
  1196. {
  1197. if (*expr == old_cond)
  1198. {
  1199. *expr= new_cond;
  1200. if (do_fix_fields)
  1201. new_cond->fix_fields(join->thd, expr);
  1202. return FALSE;
  1203. }
  1204. if ((*expr)->type() == Item::COND_ITEM)
  1205. {
  1206. List_iterator<Item> li(*((Item_cond*)(*expr))->argument_list());
  1207. Item *item;
  1208. while ((item= li++))
  1209. {
  1210. if (item == old_cond)
  1211. {
  1212. li.replace(new_cond);
  1213. if (do_fix_fields)
  1214. new_cond->fix_fields(join->thd, li.ref());
  1215. return FALSE;
  1216. }
  1217. else if (item->type() == Item::COND_ITEM)
  1218. {
  1219. replace_where_subcondition(join, li.ref(),
  1220. old_cond, new_cond,
  1221. do_fix_fields);
  1222. }
  1223. }
  1224. }
  1225. /*
  1226. We can come to here when
  1227. - we're doing replace operations on both on_expr and prep_on_expr
  1228. - on_expr is the same as prep_on_expr, or they share a sub-tree
  1229. (so, when we do replace in on_expr, we replace in prep_on_expr, too,
  1230. and when we try doing a replace in prep_on_expr, the item we wanted
  1231. to replace there has already been replaced)
  1232. */
  1233. return FALSE;
  1234. }
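/*
  Illustration only (not part of the server source): a standalone sketch of
  the pointer-identity replacement that replace_where_subcondition()
  performs on an AND/OR tree. Cond and replace_subcondition() are
  hypothetical names. Unlike the original, which returns an error flag, the
  sketch returns whether a replacement happened, and it omits the
  fix_fields() call.
*/
```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Item / Item_cond.
struct Cond
{
  bool is_cond_list= false;  // true for AND/OR nodes
  std::vector<Cond*> args;   // children of an AND/OR node
};

// Replace old_cond with new_cond, comparing by pointer identity.
// Returns true if old_cond was found and replaced.
bool replace_subcondition(Cond **expr, Cond *old_cond, Cond *new_cond)
{
  if (*expr == old_cond)
  {
    *expr= new_cond;
    return true;
  }
  if ((*expr)->is_cond_list)
  {
    for (Cond *&arg : (*expr)->args)
    {
      if (replace_subcondition(&arg, old_cond, new_cond))
        return true;
    }
  }
  // Not found: possibly already replaced through a shared sub-tree.
  return false;
}
```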
  1235. static int subq_sj_candidate_cmp(Item_in_subselect* el1, Item_in_subselect* el2,
  1236. void *arg)
  1237. {
  1238. return (el1->sj_convert_priority > el2->sj_convert_priority) ? -1 :
  1239. ( (el1->sj_convert_priority == el2->sj_convert_priority)? 0 : 1);
  1240. }
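/*
  Illustration only (not part of the server source): a standalone sketch of
  how sj_convert_priority is composed and how subq_sj_candidate_cmp()
  compares two candidates. Candidate and candidate_cmp() are hypothetical
  names, and kMaxTables is an assumed value standing in for MAX_TABLES; the
  weights only work if the per-candidate table count stays below it. The
  resulting list order depends on bubble_sort(), which is not shown here.
*/
```cpp
#include <cassert>

// Assumed stand-in for MAX_TABLES.
constexpr unsigned kMaxTables= 61;

// Hypothetical stand-in for the fields of Item_in_subselect used here.
struct Candidate
{
  bool do_not_convert;    // conversion to semi-join is blocked
  bool is_correlated;     // subquery refers to outer tables
  unsigned outer_tables;  // table count recorded for the child join

  // Same weighting as the assignment to sj_convert_priority.
  unsigned priority() const
  {
    return (do_not_convert ? 1u : 0u) * kMaxTables * 2 +
           (is_correlated ? 1u : 0u) * kMaxTables + outer_tables;
  }
};

// Same result convention as subq_sj_candidate_cmp(): -1 if the first
// candidate has the higher priority, 0 if equal, 1 otherwise.
int candidate_cmp(const Candidate &el1, const Candidate &el2)
{
  return (el1.priority() > el2.priority()) ? -1 :
         ((el1.priority() == el2.priority()) ? 0 : 1);
}
```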
/**
  @brief
  Reset the value of the field in_equality_no for all Item_func_eq
  items in the WHERE clause of the subquery.

  Look for the in_equality_no description in the Item_func_eq class.

  DESCRIPTION
    Let's take an example:
      SELECT t1.a FROM t1 WHERE t1.a IN
        (SELECT t2.a FROM t2 where t2.b IN
          (select t3.b from t3 where t3.c=27 ))

    For such a query we have the parent, child and grandchild select.

    For the equality t2.b = t3.b we set the value of in_equality_no to
    0 according to its description. We do the same for t1.a = t2.a.

    But when we look at the child select (with the grandchild select merged),
    the query would be
      SELECT t1.a FROM t1 WHERE t1.a IN
        (SELECT t2.a FROM t2 where t2.b = t3.b and t3.c=27)

    and then when the child select is merged into the parent select the query
    would look like
      SELECT t1.a FROM t1, semi-join-nest(t2,t3)
        WHERE t1.a =t2.a and t2.b = t3.b and t3.c=27

    We would still have in_equality_no set for t2.b = t3.b
    though it does not take part in the semi-join equality for the parent
    select, so we should reset its value to UINT_MAX.

  @param cond WHERE clause of the subquery
*/
  1268. static void reset_equality_number_for_subq_conds(Item * cond)
  1269. {
  1270. if (!cond)
  1271. return;
  1272. if (cond->type() == Item::COND_ITEM)
  1273. {
  1274. List_iterator<Item> li(*((Item_cond*) cond)->argument_list());
  1275. Item *item;
  1276. while ((item=li++))
  1277. {
  1278. if (item->type() == Item::FUNC_ITEM &&
  1279. ((Item_func*)item)->functype()== Item_func::EQ_FUNC)
  1280. ((Item_func_eq*)item)->in_equality_no= UINT_MAX;
  1281. }
  1282. }
  1283. else
  1284. {
  1285. if (cond->type() == Item::FUNC_ITEM &&
  1286. ((Item_func*)cond)->functype()== Item_func::EQ_FUNC)
  1287. ((Item_func_eq*)cond)->in_equality_no= UINT_MAX;
  1288. }
  1289. return;
  1290. }
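/*
  Illustration only (not part of the server source): a standalone sketch of
  reset_equality_number_for_subq_conds(). Pred and reset_equality_numbers()
  are hypothetical names. As in the original, only the condition itself or
  the direct arguments of a top-level AND/OR list are visited; deeper
  nesting is not descended into.
*/
```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for Item / Item_func_eq / Item_cond.
struct Pred
{
  enum Kind { EQ, OTHER, COND_LIST } kind;
  unsigned in_equality_no;
  std::vector<Pred*> args;  // used only when kind == COND_LIST
};

// Mark every equality reachable at the top level as "not an IN-equality".
void reset_equality_numbers(Pred *cond)
{
  if (!cond)
    return;
  if (cond->kind == Pred::COND_LIST)
  {
    for (Pred *arg : cond->args)
    {
      if (arg->kind == Pred::EQ)
        arg->in_equality_no= UINT_MAX;
    }
  }
  else if (cond->kind == Pred::EQ)
    cond->in_equality_no= UINT_MAX;
}
```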
  1291. /*
  1292. Convert a subquery predicate into a TABLE_LIST semi-join nest
  1293. SYNOPSIS
  1294. convert_subq_to_sj()
  1295. parent_join Parent join, the one that has subq_pred in its WHERE/ON
  1296. clause
  1297. subq_pred Subquery predicate to be converted
  1298. DESCRIPTION
    Convert a subquery predicate into a TABLE_LIST semi-join nest. All the
    prerequisites are already checked, so the conversion is always successful.
  1301. Prepared Statements: the transformation is permanent:
  1302. - Changes in TABLE_LIST structures are naturally permanent
  1303. - Item tree changes are performed on statement MEM_ROOT:
  1304. = we activate statement MEM_ROOT
  1305. = this function is called before the first fix_prepare_information
  1306. call.
  1307. This is intended because the criteria for subquery-to-sj conversion remain
  1308. constant for the lifetime of the Prepared Statement.
  1309. RETURN
  1310. FALSE OK
  1311. TRUE Out of memory error
  1312. */
  1313. static bool convert_subq_to_sj(JOIN *parent_join, Item_in_subselect *subq_pred)
  1314. {
  1315. SELECT_LEX *parent_lex= parent_join->select_lex;
  1316. TABLE_LIST *emb_tbl_nest= NULL;
  1317. List<TABLE_LIST> *emb_join_list= &parent_lex->top_join_list;
  1318. THD *thd= parent_join->thd;
  1319. DBUG_ENTER("convert_subq_to_sj");
  1320. /*
  1321. 1. Find out where to put the predicate into.
  1322. Note: for "t1 LEFT JOIN t2" this will be t2, a leaf.
  1323. */
  1324. if ((void*)subq_pred->emb_on_expr_nest != (void*)NO_JOIN_NEST)
  1325. {
  1326. if (subq_pred->emb_on_expr_nest->nested_join)
  1327. {
  1328. /*
  1329. We're dealing with
  1330. ... [LEFT] JOIN ( ... ) ON (subquery AND whatever) ...
  1331. The sj-nest will be inserted into the brackets nest.
  1332. */
  1333. emb_tbl_nest= subq_pred->emb_on_expr_nest;
  1334. emb_join_list= &emb_tbl_nest->nested_join->join_list;
  1335. }
  1336. else if (!subq_pred->emb_on_expr_nest->outer_join)
  1337. {
  1338. /*
  1339. We're dealing with
  1340. ... INNER JOIN tblX ON (subquery AND whatever) ...
  1341. The sj-nest will be tblX's "sibling", i.e. another child of its
  1342. parent. This is ok because tblX is joined as an inner join.
  1343. */
  1344. emb_tbl_nest= subq_pred->emb_on_expr_nest->embedding;
  1345. if (emb_tbl_nest)
  1346. emb_join_list= &emb_tbl_nest->nested_join->join_list;
  1347. }
  1348. else if (!subq_pred->emb_on_expr_nest->nested_join)
  1349. {
  1350. TABLE_LIST *outer_tbl= subq_pred->emb_on_expr_nest;
  1351. TABLE_LIST *wrap_nest;
  1352. LEX_CSTRING sj_wrap_name= { STRING_WITH_LEN("(sj-wrap)") };
  1353. /*
  1354. We're dealing with
  1355. ... LEFT JOIN tbl ON (on_expr AND subq_pred) ...
  1356. we'll need to convert it into:
  1357. ... LEFT JOIN ( tbl SJ (subq_tables) ) ON (on_expr AND subq_pred) ...
  1358. | |
  1359. |<----- wrap_nest ---->|
  1360. Q: other subqueries may be pointing to this element. What to do?
  1361. A1: simple solution: copy *subq_pred->expr_join_nest= *parent_nest.
  1362. But we'll need to fix other pointers.
  1363. A2: Another way: have TABLE_LIST::next_ptr so the following
  1364. subqueries know the table has been nested.
  1365. A3: changes in the TABLE_LIST::outer_join will make everything work
  1366. automatically.
  1367. */
  1368. if (!(wrap_nest= alloc_join_nest(thd)))
  1369. {
  1370. DBUG_RETURN(TRUE);
  1371. }
  1372. wrap_nest->embedding= outer_tbl->embedding;
  1373. wrap_nest->join_list= outer_tbl->join_list;
  1374. wrap_nest->alias= sj_wrap_name;
  1375. wrap_nest->nested_join->join_list.empty();
  1376. wrap_nest->nested_join->join_list.push_back(outer_tbl, thd->mem_root);
  1377. outer_tbl->embedding= wrap_nest;
  1378. outer_tbl->join_list= &wrap_nest->nested_join->join_list;
  1379. /*
  1380. wrap_nest will take place of outer_tbl, so move the outer join flag
  1381. and on_expr
  1382. */
  1383. wrap_nest->outer_join= outer_tbl->outer_join;
  1384. outer_tbl->outer_join= 0;
  1385. wrap_nest->on_expr= outer_tbl->on_expr;
  1386. outer_tbl->on_expr= NULL;
  1387. List_iterator<TABLE_LIST> li(*wrap_nest->join_list);
  1388. TABLE_LIST *tbl;
  1389. while ((tbl= li++))
  1390. {
  1391. if (tbl == outer_tbl)
  1392. {
  1393. li.replace(wrap_nest);
  1394. break;
  1395. }
  1396. }
  1397. /*
  1398. Ok now wrap_nest 'contains' outer_tbl and we're ready to add the
  1399. semi-join nest into it
  1400. */
  1401. emb_join_list= &wrap_nest->nested_join->join_list;
  1402. emb_tbl_nest= wrap_nest;
  1403. }
  1404. }
  1405. TABLE_LIST *sj_nest;
  1406. NESTED_JOIN *nested_join;
  1407. LEX_CSTRING sj_nest_name= { STRING_WITH_LEN("(sj-nest)") };
  1408. if (!(sj_nest= alloc_join_nest(thd)))
  1409. {
  1410. DBUG_RETURN(TRUE);
  1411. }
  1412. nested_join= sj_nest->nested_join;
  1413. sj_nest->join_list= emb_join_list;
  1414. sj_nest->embedding= emb_tbl_nest;
  1415. sj_nest->alias= sj_nest_name;
  1416. sj_nest->sj_subq_pred= subq_pred;
  1417. sj_nest->original_subq_pred_used_tables= subq_pred->used_tables() |
  1418. subq_pred->left_expr->used_tables();
  1419. /* Nests do not participate in those 'chains', so: */
  1420. /* sj_nest->next_leaf= sj_nest->next_local= sj_nest->next_global == NULL*/
  1421. emb_join_list->push_back(sj_nest, thd->mem_root);
  1422. /*
  1423. nested_join->used_tables and nested_join->not_null_tables are
  1424. initialized in simplify_joins().
  1425. */
  1426. /*
  1427. 2. Walk through subquery's top list and set 'embedding' to point to the
  1428. sj-nest.
  1429. */
  1430. st_select_lex *subq_lex= subq_pred->unit->first_select();
  1431. DBUG_ASSERT(subq_lex->next_select() == NULL);
  1432. nested_join->join_list.empty();
  1433. List_iterator_fast<TABLE_LIST> li(subq_lex->top_join_list);
  1434. TABLE_LIST *tl;
  1435. while ((tl= li++))
  1436. {
  1437. tl->embedding= sj_nest;
  1438. tl->join_list= &nested_join->join_list;
  1439. nested_join->join_list.push_back(tl, thd->mem_root);
  1440. }
  1441. /*
  1442. Reconnect the next_leaf chain.
  1443. TODO: Do we have to put subquery's tables at the end of the chain?
  1444. Inserting them at the beginning would be a bit faster.
  1445. NOTE: We actually insert them at the front! That's because the order is
  1446. reversed in this list.
  1447. */
  1448. parent_lex->leaf_tables.append(&subq_lex->leaf_tables);
  1449. if (subq_lex->options & OPTION_SCHEMA_TABLE)
  1450. parent_lex->options |= OPTION_SCHEMA_TABLE;
  1451. /*
  1452. Same as above for next_local chain
  1453. (a theory: a next_local chain always starts with ::leaf_tables
  1454. because view's tables are inserted after the view)
  1455. */
  1456. for (tl= (TABLE_LIST*)(parent_lex->table_list.first); tl->next_local; tl= tl->next_local)
  1457. {}
  1458. tl->next_local= subq_lex->join->tables_list;
  1459. /* A theory: no need to re-connect the next_global chain */
  1460. /* 3. Remove the original subquery predicate from the WHERE/ON */
  1461. // The subqueries were replaced for Item_int(1) earlier
  1462. subq_pred->reset_strategy(SUBS_SEMI_JOIN); // for subsequent executions
  1463. /*TODO: also reset the 'm_with_subquery' there. */
  1464. /* n. Adjust the parent_join->table_count counter */
  1465. uint table_no= parent_join->table_count;
  1466. /* n. Walk through child's tables and adjust table->map */
  1467. List_iterator_fast<TABLE_LIST> si(subq_lex->leaf_tables);
  1468. while ((tl= si++))
  1469. {
  1470. tl->set_tablenr(table_no);
  1471. if (tl->is_jtbm())
  1472. {
  1473. tl->jtbm_table_no= table_no;
  1474. Item *dummy= tl->jtbm_subselect;
  1475. tl->jtbm_subselect->fix_after_pullout(parent_lex, &dummy, true);
  1476. DBUG_ASSERT(dummy == tl->jtbm_subselect);
  1477. }
  1478. SELECT_LEX *old_sl= tl->select_lex;
  1479. tl->select_lex= parent_join->select_lex;
  1480. for (TABLE_LIST *emb= tl->embedding;
  1481. emb && emb->select_lex == old_sl;
  1482. emb= emb->embedding)
  1483. emb->select_lex= parent_join->select_lex;
  1484. table_no++;
  1485. }
  1486. parent_join->table_count += subq_lex->join->table_count;
  1487. //parent_join->table_count += subq_lex->leaf_tables.elements;
  1488. /*
  1489. Put the subquery's WHERE into semi-join's sj_on_expr
  1490. Add the subquery-induced equalities too.
  1491. */
  1492. SELECT_LEX *save_lex= thd->lex->current_select;
  1493. thd->lex->current_select=subq_lex;
  1494. if (subq_pred->left_expr->fix_fields_if_needed(thd, &subq_pred->left_expr))
  1495. DBUG_RETURN(TRUE);
  1496. thd->lex->current_select=save_lex;
  1497. table_map subq_pred_used_tables= subq_pred->used_tables();
  1498. sj_nest->nested_join->sj_corr_tables= subq_pred_used_tables;
  1499. sj_nest->nested_join->sj_depends_on= subq_pred_used_tables |
  1500. subq_pred->left_expr->used_tables();
  1501. sj_nest->sj_on_expr= subq_lex->join->conds;
  /*
    Create the IN-equalities and inject them into semi-join's ON expression.
    Additionally, for LooseScan strategy
    - Record the number of IN-equalities.
    - Create list of pointers to (oe1, ..., ieN). We'll need the list to
      see which of the expressions are bound and which are not (for those
      we'll produce a distinct stream of (ie_i1,...ie_ik)).

      (TODO: can we just create a list of pointers and hope the expressions
      will not substitute themselves on fix_fields()? Or do we need to wrap
      them into Item_direct_view_refs and store pointers to those? The
      pointers to Item_direct_view_refs are guaranteed to be stable as
      Item_direct_view_refs doesn't substitute itself with anything in
      Item_direct_view_ref::fix_fields.)
  */
  1516. sj_nest->sj_in_exprs= subq_pred->left_expr->cols();
  1517. sj_nest->nested_join->sj_outer_expr_list.empty();
  1518. reset_equality_number_for_subq_conds(sj_nest->sj_on_expr);
  1519. if (subq_pred->left_expr->cols() == 1)
  1520. {
  1521. /* add left = select_list_element */
  1522. nested_join->sj_outer_expr_list.push_back(&subq_pred->left_expr,
  1523. thd->mem_root);
  1524. /*
  1525. Create Item_func_eq. Note that
  1526. 1. this is done on the statement, not execution, arena
  1527. 2. if it's a PS then this happens only once - on the first execution.
  1528. On following re-executions, the item will be fix_field-ed normally.
  1529. 3. Thus it should be created as if it was fix_field'ed, in particular
  1530. all pointers to items in the execution arena should be protected
  1531. with thd->change_item_tree
  1532. */
  1533. Item_func_eq *item_eq=
  1534. new (thd->mem_root) Item_func_eq(thd, subq_pred->left_expr_orig,
  1535. subq_lex->ref_pointer_array[0]);
  1536. if (!item_eq)
  1537. DBUG_RETURN(TRUE);
  1538. if (subq_pred->left_expr_orig != subq_pred->left_expr)
  1539. thd->change_item_tree(item_eq->arguments(), subq_pred->left_expr);
  1540. item_eq->in_equality_no= 0;
  1541. sj_nest->sj_on_expr= and_items(thd, sj_nest->sj_on_expr, item_eq);
  1542. }
  1543. else if (subq_pred->left_expr->type() == Item::ROW_ITEM)
  1544. {
    /*
      disassemble left expression and add
      left1 = select_list_element1 and left2 = select_list_element2 ...
    */
  1549. for (uint i= 0; i < subq_pred->left_expr->cols(); i++)
  1550. {
  1551. nested_join->sj_outer_expr_list.push_back(subq_pred->left_expr->addr(i),
  1552. thd->mem_root);
  1553. Item_func_eq *item_eq=
  1554. new (thd->mem_root)
  1555. Item_func_eq(thd, subq_pred->left_expr_orig->element_index(i),
  1556. subq_lex->ref_pointer_array[i]);
  1557. if (!item_eq)
  1558. DBUG_RETURN(TRUE);
  1559. DBUG_ASSERT(subq_pred->left_expr->element_index(i)->is_fixed());
  1560. if (subq_pred->left_expr_orig->element_index(i) !=
  1561. subq_pred->left_expr->element_index(i))
  1562. thd->change_item_tree(item_eq->arguments(),
  1563. subq_pred->left_expr->element_index(i));
  1564. item_eq->in_equality_no= i;
  1565. sj_nest->sj_on_expr= and_items(thd, sj_nest->sj_on_expr, item_eq);
  1566. }
  1567. }
  1568. else
  1569. {
  1570. /*
  1571. add row operation
  1572. left = (select_list_element1, select_list_element2, ...)
  1573. */
  1574. Item_row *row= new (thd->mem_root) Item_row(thd, subq_lex->pre_fix);
    /* fix_fields() was already called on the subquery, so they should be the same */
  1576. if (!row)
  1577. DBUG_RETURN(TRUE);
  1578. DBUG_ASSERT(subq_pred->left_expr->cols() == row->cols());
  1579. nested_join->sj_outer_expr_list.push_back(&subq_pred->left_expr);
  1580. Item_func_eq *item_eq=
  1581. new (thd->mem_root) Item_func_eq(thd, subq_pred->left_expr_orig, row);
  1582. if (!item_eq)
  1583. DBUG_RETURN(TRUE);
  1584. for (uint i= 0; i < row->cols(); i++)
  1585. {
  1586. if (row->element_index(i) != subq_lex->ref_pointer_array[i])
  1587. thd->change_item_tree(row->addr(i), subq_lex->ref_pointer_array[i]);
  1588. }
  1589. item_eq->in_equality_no= 0;
  1590. sj_nest->sj_on_expr= and_items(thd, sj_nest->sj_on_expr, item_eq);
  1591. }
  1592. /*
  1593. Fix the created equality and AND
  1594. Note that fix_fields() can actually fail in a meaningful way here. One
  1595. example is when the IN-equality is not valid, because it compares columns
  1596. with incompatible collations. (One can argue it would be more appropriate
  1597. to check for this at name resolution stage, but as a legacy of IN->EXISTS
  1598. we have it here).
  1599. */
  1600. if (sj_nest->sj_on_expr->fix_fields_if_needed(thd, &sj_nest->sj_on_expr))
  1601. {
  1602. DBUG_RETURN(TRUE);
  1603. }
  1604. /*
  1605. Walk through sj nest's WHERE and ON expressions and call
  1606. item->fix_table_changes() for all items.
  1607. */
  1608. sj_nest->sj_on_expr->fix_after_pullout(parent_lex, &sj_nest->sj_on_expr,
  1609. TRUE);
  1610. fix_list_after_tbl_changes(parent_lex, &sj_nest->nested_join->join_list);
  1611. /* Unlink the child select_lex so it doesn't show up in EXPLAIN: */
  1612. subq_lex->master_unit()->exclude_level();
  1613. DBUG_EXECUTE("where",
  1614. print_where(sj_nest->sj_on_expr,"SJ-EXPR", QT_ORDINARY););
  1615. /* Inject sj_on_expr into the parent's WHERE or ON */
  1616. if (emb_tbl_nest)
  1617. {
  1618. emb_tbl_nest->on_expr= and_items(thd, emb_tbl_nest->on_expr,
  1619. sj_nest->sj_on_expr);
  1620. emb_tbl_nest->on_expr->top_level_item();
  1621. if (emb_tbl_nest->on_expr->fix_fields_if_needed(thd,
  1622. &emb_tbl_nest->on_expr))
  1623. {
  1624. DBUG_RETURN(TRUE);
  1625. }
  1626. }
  1627. else
  1628. {
  1629. /* Inject into the WHERE */
  1630. parent_join->conds= and_items(thd, parent_join->conds, sj_nest->sj_on_expr);
  1631. parent_join->conds->top_level_item();
  1632. /*
  1633. fix_fields must update the properties (e.g. st_select_lex::cond_count) of
  1634. the correct select_lex.
  1635. */
  1636. save_lex= thd->lex->current_select;
  1637. thd->lex->current_select=parent_join->select_lex;
  1638. if (parent_join->conds->fix_fields_if_needed(thd, &parent_join->conds))
  1639. {
  1640. DBUG_RETURN(1);
  1641. }
  1642. thd->lex->current_select=save_lex;
  1643. parent_join->select_lex->where= parent_join->conds;
  1644. }
  1645. if (subq_lex->ftfunc_list->elements)
  1646. {
  1647. Item_func_match *ifm;
  1648. List_iterator_fast<Item_func_match> li(*(subq_lex->ftfunc_list));
  1649. while ((ifm= li++))
  1650. parent_lex->ftfunc_list->push_front(ifm, thd->mem_root);
  1651. }
  1652. parent_lex->have_merged_subqueries= TRUE;
  1653. /* Fatal error may have been set by fix_after_pullout() */
  1654. DBUG_RETURN(thd->is_fatal_error);
  1655. }
  1656. const int SUBQERY_TEMPTABLE_NAME_MAX_LEN= 20;
  1657. static void create_subquery_temptable_name(LEX_STRING *str, uint number)
  1658. {
  1659. char *to= str->str;
  1660. DBUG_ASSERT(number < 10000);
  1661. to= strmov(to, "<subquery");
  1662. to= int10_to_str((int) number, to, 10);
  1663. to[0]= '>';
  1664. to[1]= 0;
  1665. str->length= (size_t) (to - str->str)+1;
  1666. }
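The alias construction above can be illustrated outside the server. The following is a minimal sketch, not the server's code: `subquery_temptable_name` and `kMaxLen` are hypothetical names, and `std::snprintf` stands in for the `strmov`/`int10_to_str` pair used in the original.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for create_subquery_temptable_name(): builds the
// "<subqueryN>" alias in a fixed-size buffer.
std::string subquery_temptable_name(unsigned number)
{
  const int kMaxLen = 20;            // mirrors SUBQERY_TEMPTABLE_NAME_MAX_LEN
  assert(number < 10000);            // same bound the original asserts
  char buf[kMaxLen];
  int len = std::snprintf(buf, sizeof(buf), "<subquery%u>", number);
  // "<subquery" (9 chars) + at most 4 digits + ">" always fits in 20 bytes.
  assert(len > 0 && len < kMaxLen);
  return std::string(buf, static_cast<std::size_t>(len));
}
```

The assertion on `number` is what keeps the 20-byte buffer sufficient: the longest possible alias is 14 characters plus the terminating NUL.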
  1667. /*
  1668. Convert subquery predicate into non-mergeable semi-join nest.
  1669. TODO:
  1670. why does this do IN-EXISTS conversion? Can't we unify it with mergeable
  1671. semi-joins? currently, convert_subq_to_sj() cannot fail to convert (unless
  1672. fatal errors)
  1673. RETURN
  1674. FALSE - Ok
  1675. TRUE - Fatal error
  1676. */
  1677. static bool convert_subq_to_jtbm(JOIN *parent_join,
  1678. Item_in_subselect *subq_pred,
  1679. bool *remove_item)
  1680. {
  1681. SELECT_LEX *parent_lex= parent_join->select_lex;
  1682. List<TABLE_LIST> *emb_join_list= &parent_lex->top_join_list;
  1683. TABLE_LIST *emb_tbl_nest= NULL; // will change when we learn to handle outer joins
  1684. TABLE_LIST *tl;
  1685. bool optimization_delayed= TRUE;
  1686. TABLE_LIST *jtbm;
  1687. LEX_STRING tbl_alias;
  1688. THD *thd= parent_join->thd;
  1689. DBUG_ENTER("convert_subq_to_jtbm");
  1690. subq_pred->set_strategy(SUBS_MATERIALIZATION);
  1691. subq_pred->is_jtbm_merged= TRUE;
  1692. *remove_item= TRUE;
  1693. if (!(tbl_alias.str= (char*)thd->calloc(SUBQERY_TEMPTABLE_NAME_MAX_LEN)) ||
  1694. !(jtbm= alloc_join_nest(thd))) //todo: this is not a join nest!
  1695. {
  1696. DBUG_RETURN(TRUE);
  1697. }
  1698. jtbm->join_list= emb_join_list;
  1699. jtbm->embedding= emb_tbl_nest;
  1700. jtbm->jtbm_subselect= subq_pred;
  1701. jtbm->nested_join= NULL;
  1702. /* Nests do not participate in those 'chains', so: */
  1703. /* jtbm->next_leaf= jtbm->next_local= jtbm->next_global == NULL*/
  1704. emb_join_list->push_back(jtbm, thd->mem_root);
  1705. /*
  1706. Inject the jtbm table into TABLE_LIST::next_leaf list, so that
  1707. make_join_statistics() and co. can find it.
  1708. */
  1709. parent_lex->leaf_tables.push_back(jtbm, thd->mem_root);
  1710. if (subq_pred->unit->first_select()->options & OPTION_SCHEMA_TABLE)
  1711. parent_lex->options |= OPTION_SCHEMA_TABLE;
  1712. /*
  1713. Same as above for TABLE_LIST::next_local chain
  1714. (a theory: a next_local chain always starts with ::leaf_tables
  1715. because view's tables are inserted after the view)
  1716. */
  1717. for (tl= (TABLE_LIST*)(parent_lex->table_list.first); tl->next_local; tl= tl->next_local)
  1718. {}
  1719. tl->next_local= jtbm;
  1720. /* A theory: no need to re-connect the next_global chain */
  1721. if (optimization_delayed)
  1722. {
  1723. DBUG_ASSERT(parent_join->table_count < MAX_TABLES);
  1724. jtbm->jtbm_table_no= parent_join->table_count;
  1725. create_subquery_temptable_name(&tbl_alias,
  1726. subq_pred->unit->first_select()->select_number);
  1727. jtbm->alias.str= tbl_alias.str;
  1728. jtbm->alias.length= tbl_alias.length;
  1729. parent_join->table_count++;
  1730. DBUG_RETURN(thd->is_fatal_error);
  1731. }
  1732. subselect_hash_sj_engine *hash_sj_engine=
  1733. ((subselect_hash_sj_engine*)subq_pred->engine);
  1734. jtbm->table= hash_sj_engine->tmp_table;
  1735. jtbm->table->tablenr= parent_join->table_count;
  1736. jtbm->table->map= table_map(1) << (parent_join->table_count);
  1737. jtbm->jtbm_table_no= jtbm->table->tablenr;
  1738. parent_join->table_count++;
  1739. DBUG_ASSERT(parent_join->table_count < MAX_TABLES);
  1740. Item *conds= hash_sj_engine->semi_join_conds;
  1741. conds->fix_after_pullout(parent_lex, &conds, TRUE);
  1742. DBUG_EXECUTE("where", print_where(conds,"SJ-EXPR", QT_ORDINARY););
  1743. create_subquery_temptable_name(&tbl_alias, hash_sj_engine->materialize_join->
  1744. select_lex->select_number);
  1745. jtbm->alias.str= tbl_alias.str;
  1746. jtbm->alias.length= tbl_alias.length;
  1747. parent_lex->have_merged_subqueries= TRUE;
  1748. /* Don't unlink the child subselect, as the subquery will be used. */
  1749. DBUG_RETURN(thd->is_fatal_error);
  1750. }
  1751. static TABLE_LIST *alloc_join_nest(THD *thd)
  1752. {
  1753. TABLE_LIST *tbl;
  1754. if (!(tbl= (TABLE_LIST*) thd->calloc(ALIGN_SIZE(sizeof(TABLE_LIST))+
  1755. sizeof(NESTED_JOIN))))
  1756. return NULL;
  1757. tbl->nested_join= (NESTED_JOIN*) ((uchar*)tbl +
  1758. ALIGN_SIZE(sizeof(TABLE_LIST)));
  1759. return tbl;
  1760. }
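alloc_join_nest() places a NESTED_JOIN directly behind a TABLE_LIST in a single zero-filled allocation. A minimal sketch of that layout trick follows, under stated assumptions: `MiniTableList`, `MiniNestedJoin`, `align_size` and `alloc_mini_join_nest` are hypothetical stand-ins, and `std::calloc` replaces `thd->calloc()`, which in the server allocates on a memory root that is freed as a whole.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical miniature stand-ins for TABLE_LIST and NESTED_JOIN.
struct MiniNestedJoin { double cost; int n_tables; };
struct MiniTableList  { char tag; MiniNestedJoin *nested_join; };

// Round n up to a multiple of the strictest fundamental alignment
// (plays the role of ALIGN_SIZE in the original).
constexpr std::size_t align_size(std::size_t n)
{
  return (n + alignof(std::max_align_t) - 1) /
         alignof(std::max_align_t) * alignof(std::max_align_t);
}

// One zero-filled block holds both structs; the second one starts at the
// first suitably aligned offset after the first.
MiniTableList *alloc_mini_join_nest()
{
  void *mem = std::calloc(1, align_size(sizeof(MiniTableList)) +
                             sizeof(MiniNestedJoin));
  if (!mem)
    return nullptr;
  MiniTableList *tbl = static_cast<MiniTableList*>(mem);
  tbl->nested_join = reinterpret_cast<MiniNestedJoin*>(
      static_cast<unsigned char*>(mem) + align_size(sizeof(MiniTableList)));
  return tbl;
}
```

In this sketch the caller releases both objects with one `std::free(tbl)`. Rounding the first struct's size up is what guarantees the second struct is correctly aligned.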
  1761. /*
  1762. @Note thd->is_fatal_error can be set in case of OOM
  1763. */
  1764. void fix_list_after_tbl_changes(SELECT_LEX *new_parent, List<TABLE_LIST> *tlist)
  1765. {
  1766. List_iterator<TABLE_LIST> it(*tlist);
  1767. TABLE_LIST *table;
  1768. while ((table= it++))
  1769. {
  1770. if (table->on_expr)
  1771. table->on_expr->fix_after_pullout(new_parent, &table->on_expr, TRUE);
  1772. if (table->nested_join)
  1773. fix_list_after_tbl_changes(new_parent, &table->nested_join->join_list);
  1774. }
  1775. }
  1776. static void set_emb_join_nest(List<TABLE_LIST> *tables, TABLE_LIST *emb_sj_nest)
  1777. {
  1778. List_iterator<TABLE_LIST> it(*tables);
  1779. TABLE_LIST *tbl;
  1780. while ((tbl= it++))
  1781. {
  1782. /*
  1783. Note: check for nested_join first.
  1784. derived-merged tables have tbl->table!=NULL &&
  1785. tbl->table->reginfo==NULL.
  1786. */
  1787. if (tbl->nested_join)
  1788. set_emb_join_nest(&tbl->nested_join->join_list, emb_sj_nest);
  1789. else if (tbl->table)
  1790. tbl->table->reginfo.join_tab->emb_sj_nest= emb_sj_nest;
  1791. }
  1792. }
  1793. /*
  1794. Pull tables out of semi-join nests, if possible
  1795. SYNOPSIS
  1796. pull_out_semijoin_tables()
  1797. join The join where to do the semi-join flattening
  1798. DESCRIPTION
  1799. Try to pull tables out of semi-join nests.
  1800. PRECONDITIONS
  1801. When this function is called, the join may have several semi-join nests
  1802. but it is guaranteed that one semi-join nest does not contain another.
  1803. ACTION
  1804. A table can be pulled out of the semi-join nest if
  1805. - It is a constant table, or
  1806. - It is accessed via eq_ref(outer_tables)
  1807. POSTCONDITIONS
  1808. * Tables that were pulled out have JOIN_TAB::emb_sj_nest == NULL
  1809. * Tables that were not pulled out have JOIN_TAB::emb_sj_nest pointing
  1810. to semi-join nest they are in.
  1811. * Semi-join nests' TABLE_LIST::sj_inner_tables is updated accordingly
  1812. This operation is (and should be) performed at each PS execution since
  1813. tables may become/cease to be constant across PS reexecutions.
  1814. NOTE
  1815. Table pullout may make an uncorrelated subquery correlated. Consider this
  1816. example:
  1817. ... WHERE oe IN (SELECT it1.primary_key WHERE p(it1, it2) ... )
  1818. here table it1 can be pulled out (we have it1.primary_key=oe which gives
  1819. us functional dependency). Once it1 is pulled out, all references to it1
  1820. from p(it1, it2) become references to outside of the subquery and thus
  1821. make the subquery (i.e. its semi-join nest) correlated.
  1822. Making the subquery (i.e. its semi-join nest) correlated prevents us from
  1823. using Materialization or LooseScan to execute it.
  1824. RETURN
  1825. 0 - OK
  1826. 1 - Out of memory error
  1827. */
  1828. int pull_out_semijoin_tables(JOIN *join)
  1829. {
  1830. TABLE_LIST *sj_nest;
  1831. DBUG_ENTER("pull_out_semijoin_tables");
  1832. List_iterator<TABLE_LIST> sj_list_it(join->select_lex->sj_nests);
  1833. /* Try pulling tables out of each of the semi-joins */
  1834. while ((sj_nest= sj_list_it++))
  1835. {
  1836. List_iterator<TABLE_LIST> child_li(sj_nest->nested_join->join_list);
  1837. TABLE_LIST *tbl;
  1838. Json_writer_object trace_wrapper(join->thd);
  1839. Json_writer_object trace(join->thd, "semijoin_table_pullout");
  1840. Json_writer_array trace_arr(join->thd, "pulled_out_tables");
  1841. /*
  1842. Don't do table pull-out for nested joins (if we get nested joins here, it
  1843. means these are outer joins). It is theoretically possible to do pull-out
  1844. for some of the outer tables but we don't support this currently.
  1845. */
  1846. bool have_join_nest_children= FALSE;
  1847. set_emb_join_nest(&sj_nest->nested_join->join_list, sj_nest);
  1848. while ((tbl= child_li++))
  1849. {
  1850. if (tbl->nested_join)
  1851. {
  1852. have_join_nest_children= TRUE;
  1853. break;
  1854. }
  1855. }
  1856. table_map pulled_tables= 0;
  1857. table_map dep_tables= 0;
  1858. if (have_join_nest_children)
  1859. goto skip;
  1860. /*
  1861. Calculate set of tables within this semi-join nest that have
  1862. other dependent tables
  1863. */
  1864. child_li.rewind();
  1865. while ((tbl= child_li++))
  1866. {
  1867. TABLE *const table= tbl->table;
  1868. if (table &&
  1869. (table->reginfo.join_tab->dependent &
  1870. sj_nest->nested_join->used_tables))
  1871. dep_tables|= table->reginfo.join_tab->dependent;
  1872. }
  1873. /* Action #1: Mark the constant tables to be pulled out */
  1874. child_li.rewind();
  1875. while ((tbl= child_li++))
  1876. {
  1877. if (tbl->table)
  1878. {
  1879. tbl->table->reginfo.join_tab->emb_sj_nest= sj_nest;
  1880. #if 0
  1881. /*
  1882. Do not pull out tables because they are constant. This operation has
  1883. a problem:
  1884. - Some constant tables may become/cease to be constant across PS
  1885. re-executions
  1886. - Contrary to our initial assumption, it turned out that table pullout
  1887. operation is not easily undoable.
  1888. The solution is to leave constant tables where they are. This will
  1889. affect only constant tables that are 1-row or empty, tables that are
  1890. constant because they are accessed via eq_ref(const) access will
  1891. still be pulled out as functionally-dependent.
  1892. This will cause us to miss the chance to flatten some of the
  1893. subqueries, but since const tables do not generate many duplicates,
  1894. it really doesn't matter that much whether they were pulled out or
  1895. not.
  1896. All of this was done as fix for BUG#43768.
  1897. */
  1898. if (tbl->table->map & join->const_table_map)
  1899. {
  1900. pulled_tables |= tbl->table->map;
  1901. DBUG_PRINT("info", ("Table %s pulled out (reason: constant)",
  1902. tbl->table->alias));
  1903. }
  1904. #endif
  1905. }
  1906. }
  1907. /*
  1908. Action #2: Find which tables we can pull out based on
  1909. update_ref_and_keys() data. Note that pulling one table out can allow
  1910. us to pull out some other tables too.
  1911. */
  1912. bool pulled_a_table;
  1913. do
  1914. {
  1915. pulled_a_table= FALSE;
  1916. child_li.rewind();
  1917. while ((tbl= child_li++))
  1918. {
  1919. if (tbl->table && !(pulled_tables & tbl->table->map) &&
  1920. !(dep_tables & tbl->table->map))
  1921. {
  1922. if (find_eq_ref_candidate(tbl->table,
  1923. sj_nest->nested_join->used_tables &
  1924. ~pulled_tables))
  1925. {
  1926. pulled_a_table= TRUE;
  1927. pulled_tables |= tbl->table->map;
  1928. DBUG_PRINT("info", ("Table %s pulled out (reason: func dep)",
  1929. tbl->table->alias.c_ptr_safe()));
  1930. trace_arr.add(tbl->table->alias.c_ptr_safe());
  1931. /*
  1932. Pulling a table out of an uncorrelated subquery in general
  1933. makes it correlated. See the NOTE to this function.
  1934. */
  1935. sj_nest->sj_subq_pred->is_correlated= TRUE;
  1936. sj_nest->nested_join->sj_corr_tables|= tbl->table->map;
  1937. sj_nest->nested_join->sj_depends_on|= tbl->table->map;
  1938. }
  1939. }
  1940. }
  1941. } while (pulled_a_table);
  1942. child_li.rewind();
  1943. skip:
  1944. /*
  1945. Action #3: Move the pulled out TABLE_LIST elements to the parents.
  1946. */
  1947. table_map inner_tables= sj_nest->nested_join->used_tables &
  1948. ~pulled_tables;
  1949. /* Record the bitmap of inner tables */
  1950. sj_nest->sj_inner_tables= inner_tables;
  1951. if (pulled_tables)
  1952. {
  1953. List<TABLE_LIST> *upper_join_list= (sj_nest->embedding != NULL)?
  1954. (&sj_nest->embedding->nested_join->join_list):
  1955. (&join->select_lex->top_join_list);
  1956. Query_arena *arena, backup;
  1957. arena= join->thd->activate_stmt_arena_if_needed(&backup);
  1958. while ((tbl= child_li++))
  1959. {
  1960. if (tbl->table)
  1961. {
  1962. if (inner_tables & tbl->table->map)
  1963. {
  1964. /* This table is not pulled out */
  1965. tbl->table->reginfo.join_tab->emb_sj_nest= sj_nest;
  1966. }
  1967. else
  1968. {
  1969. /* This table has been pulled out of the semi-join nest */
  1970. tbl->table->reginfo.join_tab->emb_sj_nest= NULL;
  1971. /*
  1972. Pull the table up in the same way as simplify_joins() does:
  1973. update join_list and embedding pointers but keep next[_local]
  1974. pointers.
  1975. */
  1976. child_li.remove();
  1977. sj_nest->nested_join->used_tables &= ~tbl->table->map;
  1978. upper_join_list->push_back(tbl, join->thd->mem_root);
  1979. tbl->join_list= upper_join_list;
  1980. tbl->embedding= sj_nest->embedding;
  1981. }
  1982. }
  1983. }
  1984. /* Remove the sj-nest itself if we've removed everything from it */
  1985. if (!inner_tables)
  1986. {
  1987. List_iterator<TABLE_LIST> li(*upper_join_list);
  1988. /* Find the sj_nest in the list. */
  1989. while (sj_nest != li++) ;
  1990. li.remove();
  1991. /* Also remove it from the list of SJ-nests: */
  1992. sj_list_it.remove();
  1993. }
  1994. if (arena)
  1995. join->thd->restore_active_arena(arena, &backup);
  1996. }
  1997. }
  1998. DBUG_RETURN(0);
  1999. }
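Action #2 above is a fixpoint iteration: pulling one table out can make another table's unique key fully bound by tables outside the nest, so the scan repeats until a pass pulls nothing. A minimal sketch of that loop over plain bitmaps follows. `MiniTable`, `full_key_refs` and `compute_pulled_tables` are hypothetical; in particular, `full_key_refs` is an assumed pre-digested form of what find_eq_ref_candidate() derives from the KEYUSE array.

```cpp
#include <cstdint>
#include <vector>

using table_map = std::uint64_t;

// Hypothetical summary of one table inside a semi-join nest.
struct MiniTable
{
  table_map map;                         // this table's bit
  // One entry per unique key whose parts are all bound by equalities:
  // the set of tables those equalities refer to.
  std::vector<table_map> full_key_refs;
};

// Returns the bitmap of tables that can be pulled out of the nest.
table_map compute_pulled_tables(const std::vector<MiniTable> &nest,
                                table_map dep_tables)
{
  table_map used = 0;
  for (const MiniTable &t : nest)
    used |= t.map;

  table_map pulled = 0;
  bool pulled_a_table;
  do
  {
    pulled_a_table = false;
    for (const MiniTable &t : nest)
    {
      if ((pulled | dep_tables) & t.map)
        continue;                        // already pulled, or must stay
      const table_map inner = used & ~pulled;
      for (table_map refs : t.full_key_refs)
      {
        if (!(refs & inner))             // key bound by non-inner tables only
        {
          pulled |= t.map;
          pulled_a_table = true;
          break;
        }
      }
    }
  } while (pulled_a_table);
  return pulled;
}
```

For a nest listed as {it2, it1, it3}, where it1's key is bound by an outer table and it2's key is bound by it1, the first pass pulls only it1 and the second pass then pulls it2.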
  2000. /*
  2001. Optimize semi-join nests that could be run with sj-materialization
  2002. SYNOPSIS
  2003. optimize_semijoin_nests()
  2004. join The join to optimize semi-join nests for
  2005. all_table_map Bitmap of all tables in the join
  2006. DESCRIPTION
  2007. Optimize each of the semi-join nests that can be run with
  2008. materialization. For each of the nests, we
  2009. - Generate the best join order for this "sub-join" and remember it;
  2010. - Remember the sub-join execution cost (it's part of materialization
  2011. cost);
  2012. - Calculate other costs that will be incurred if we decide
  2013. to use materialization strategy for this semi-join nest.
  2014. All obtained information is saved and will be used by the main join
  2015. optimization pass.
  2016. NOTES
  2017. Because of JOIN::reoptimize(), this function may be called multiple times.
  2018. RETURN
  2019. FALSE Ok
  2020. TRUE Out of memory error
  2021. */
  2022. bool optimize_semijoin_nests(JOIN *join, table_map all_table_map)
  2023. {
  2024. DBUG_ENTER("optimize_semijoin_nests");
  2025. THD *thd= join->thd;
  2026. List_iterator<TABLE_LIST> sj_list_it(join->select_lex->sj_nests);
  2027. TABLE_LIST *sj_nest;
  2028. if (!join->select_lex->sj_nests.elements)
  2029. DBUG_RETURN(FALSE);
  2030. Json_writer_object wrapper(thd);
  2031. Json_writer_object trace_semijoin_nest(thd,
  2032. "execution_plan_for_potential_materialization");
  2033. Json_writer_array trace_steps_array(thd, "steps");
  2034. while ((sj_nest= sj_list_it++))
  2035. {
  2036. /* semi-join nests with only constant tables are not valid */
  2037. /// DBUG_ASSERT(sj_nest->sj_inner_tables & ~join->const_table_map);
  2038. sj_nest->sj_mat_info= NULL;
  2039. /*
  2040. The statement may have been executed with 'semijoin=on' earlier.
  2041. We need to verify that 'semijoin=on' still holds.
  2042. */
  2043. if (optimizer_flag(join->thd, OPTIMIZER_SWITCH_SEMIJOIN) &&
  2044. optimizer_flag(join->thd, OPTIMIZER_SWITCH_MATERIALIZATION))
  2045. {
  2046. if ((sj_nest->sj_inner_tables & ~join->const_table_map) && /* not everything was pulled out */
  2047. !sj_nest->sj_subq_pred->is_correlated &&
  2048. sj_nest->sj_subq_pred->types_allow_materialization)
  2049. {
  2050. join->emb_sjm_nest= sj_nest;
  2051. if (choose_plan(join, all_table_map &~join->const_table_map))
  2052. DBUG_RETURN(TRUE); /* purecov: inspected */
  2053. /*
  2054. The best plan to run the subquery is now in join->best_positions,
  2055. save it.
  2056. */
  2057. uint n_tables= my_count_bits(sj_nest->sj_inner_tables & ~join->const_table_map);
  2058. SJ_MATERIALIZATION_INFO* sjm;
  2059. if (!(sjm= new SJ_MATERIALIZATION_INFO) ||
  2060. !(sjm->positions= (POSITION*)join->thd->alloc(sizeof(POSITION)*
  2061. n_tables)))
  2062. DBUG_RETURN(TRUE); /* purecov: inspected */
  2063. sjm->tables= n_tables;
  2064. sjm->is_used= FALSE;
  2065. double subjoin_out_rows, subjoin_read_time;
  2066. /*
  2067. join->get_partial_cost_and_fanout(n_tables + join->const_tables,
  2068. table_map(-1),
  2069. &subjoin_read_time,
  2070. &subjoin_out_rows);
  2071. */
  2072. join->get_prefix_cost_and_fanout(n_tables,
  2073. &subjoin_read_time,
  2074. &subjoin_out_rows);
  2075. sjm->materialization_cost.convert_from_cost(subjoin_read_time);
  2076. sjm->rows_with_duplicates= sjm->rows= subjoin_out_rows;
  2077. // Don't use the following list because it has "stale" items. use
  2078. // ref_pointer_array instead:
  2079. //
  2080. //List<Item> &right_expr_list=
  2081. // sj_nest->sj_subq_pred->unit->first_select()->item_list;
  2082. /*
  2083. Adjust output cardinality estimates. If the subquery has form
  2084. ... oe IN (SELECT t1.colX, t2.colY, func(X,Y,Z) )
  2085. then the number of distinct output record combinations has an
  2086. upper bound of product of number of records matching the tables
  2087. that are used by the SELECT clause.
  2088. TODO:
  2089. We can get a more precise estimate if we
  2090. - use rec_per_key cardinality estimates. For simple cases like
  2091. "oe IN (SELECT t.key ...)" it is trivial.
  2092. - Functional dependencies between the tables in the semi-join
  2093. nest (the payoff is probably less here?)
  2094. See also get_post_group_estimate().
  2095. */
  2096. SELECT_LEX *subq_select= sj_nest->sj_subq_pred->unit->first_select();
  2097. {
  2098. for (uint i=0 ; i < join->const_tables + sjm->tables ; i++)
  2099. {
  2100. JOIN_TAB *tab= join->best_positions[i].table;
  2101. join->map2table[tab->table->tablenr]= tab;
  2102. }
  2103. table_map map= 0;
  2104. for (uint i=0; i < subq_select->item_list.elements; i++)
  2105. map|= subq_select->ref_pointer_array[i]->used_tables();
  2106. map= map & ~PSEUDO_TABLE_BITS;
  2107. Table_map_iterator tm_it(map);
  2108. int tableno;
  2109. double rows= 1.0;
  2110. while ((tableno = tm_it.next_bit()) != Table_map_iterator::BITMAP_END)
  2111. rows= COST_MULT(rows,
  2112. join->map2table[tableno]->table->quick_condition_rows);
  2113. sjm->rows= MY_MIN(sjm->rows, rows);
  2114. }
  2115. memcpy((uchar*) sjm->positions,
  2116. (uchar*) (join->best_positions + join->const_tables),
  2117. sizeof(POSITION) * n_tables);
  2118. /*
  2119. Calculate temporary table parameters and usage costs
  2120. */
  2121. uint rowlen= get_tmp_table_rec_length(subq_select->ref_pointer_array,
  2122. subq_select->item_list.elements);
  2123. double lookup_cost= get_tmp_table_lookup_cost(join->thd,
  2124. subjoin_out_rows, rowlen);
  2125. double write_cost= get_tmp_table_write_cost(join->thd,
  2126. subjoin_out_rows, rowlen);
  2127. /*
  2128. Let materialization cost include the cost to write the data into the
  2129. temporary table:
  2130. */
  2131. sjm->materialization_cost.add_io(subjoin_out_rows, write_cost);
  2132. /*
  2133. Set the cost to do a full scan of the temptable (will need this to
  2134. consider doing sjm-scan):
  2135. */
  2136. sjm->scan_cost.reset();
  2137. sjm->scan_cost.add_io(sjm->rows, lookup_cost);
  2138. sjm->lookup_cost.convert_from_cost(lookup_cost);
  2139. sj_nest->sj_mat_info= sjm;
  2140. DBUG_EXECUTE("opt", print_sjm(sjm););
  2141. }
  2142. }
  2143. }
  2144. join->emb_sjm_nest= NULL;
  2145. DBUG_RETURN(FALSE);
  2146. }
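The cardinality adjustment in optimize_semijoin_nests() caps the materialized row estimate at the product of the row counts of the tables referenced by the subquery's select list. A minimal sketch of that bound follows, assuming a hypothetical `rows_per_table` vector indexed by table number in place of `join->map2table[...]->table->quick_condition_rows`, and plain multiplication in place of `COST_MULT`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using table_map = std::uint64_t;

// Upper bound on distinct rows in the materialized table: no more rows
// than the sub-join produces, and no more than the product of the row
// counts of the tables the select list actually refers to.
double sjm_rows_upper_bound(double subjoin_out_rows,
                            table_map select_list_tables,
                            const std::vector<double> &rows_per_table)
{
  double product = 1.0;
  for (std::size_t i = 0; i < rows_per_table.size() && i < 64; i++)
  {
    if (select_list_tables & (table_map{1} << i))
      product *= rows_per_table[i];
  }
  return std::min(subjoin_out_rows, product);
}
```

With per-table row counts {10, 500, 20} and a select list that touches only tables 0 and 2, a sub-join estimate of 1000 rows is reduced to 200.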
  2147. /*
  2148. Get estimated record length for semi-join materialization temptable
  2149. SYNOPSIS
  2150. get_tmp_table_rec_length()
  2151. items IN subquery's select list.
  2152. DESCRIPTION
  2153. Calculate estimated record length for semi-join materialization
  2154. temptable. It's an estimate because we don't follow every bit of
  2155. create_tmp_table()'s logic. This isn't necessary as the return value of
  2156. this function is used only for cost calculations.
  2157. RETURN
  2158. Length of the temptable record, in bytes
  2159. */
  2160. static uint get_tmp_table_rec_length(Ref_ptr_array p_items, uint elements)
  2161. {
  2162. uint len= 0;
  2163. Item *item;
  2164. //List_iterator<Item> it(items);
  2165. for (uint i= 0; i < elements ; i++)
  2166. {
  2167. item = p_items[i];
  2168. switch (item->result_type()) {
  2169. case REAL_RESULT:
  2170. len += sizeof(double);
  2171. break;
  2172. case INT_RESULT:
  2173. if (item->max_length >= (MY_INT32_NUM_DECIMAL_DIGITS - 1))
  2174. len += 8;
  2175. else
  2176. len += 4;
  2177. break;
  2178. case STRING_RESULT:
  2179. enum enum_field_types type;
  2180. /* DATE/TIME and GEOMETRY fields have STRING_RESULT result type. */
  2181. if ((type= item->field_type()) == MYSQL_TYPE_DATETIME ||
  2182. type == MYSQL_TYPE_TIME || type == MYSQL_TYPE_DATE ||
  2183. type == MYSQL_TYPE_TIMESTAMP || type == MYSQL_TYPE_GEOMETRY)
  2184. len += 8;
  2185. else
  2186. len += item->max_length;
  2187. break;
  2188. case DECIMAL_RESULT:
  2189. len += 10;
  2190. break;
  2191. case ROW_RESULT:
  2192. default:
  2193. DBUG_ASSERT(0); /* purecov: deadcode */
  2194. break;
  2195. }
  2196. }
  2197. return len;
  2198. }
  2199. /**
  2200. The cost of a lookup into a unique hash/btree index on a temporary table
  2201. with 'row_count' rows each of size 'row_size'.
  2202. @param thd current query context
  2203. @param row_count number of rows in the temp table
  2204. @param row_size average size in bytes of the rows
  2205. @return the cost of one lookup
  2206. */
  2207. double
  2208. get_tmp_table_lookup_cost(THD *thd, double row_count, uint row_size)
  2209. {
  2210. if (row_count > thd->variables.max_heap_table_size / (double) row_size)
  2211. return (double) DISK_TEMPTABLE_LOOKUP_COST;
  2212. else
  2213. return (double) HEAP_TEMPTABLE_LOOKUP_COST;
  2214. }
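The lookup cost is a two-level step function: one constant while the table is expected to fit in the in-memory heap engine, another once it would spill to disk. A minimal sketch of that threshold follows. The values 0.05 and 1.0 are assumed placeholders for HEAP_TEMPTABLE_LOOKUP_COST and DISK_TEMPTABLE_LOOKUP_COST, and the hypothetical `max_heap_table_size` parameter replaces the session variable read from `thd`.

```cpp
// Assumed placeholder costs; the real constants are defined elsewhere
// in the server sources.
const double kHeapLookupCost = 0.05;
const double kDiskLookupCost = 1.0;

// Cost of one lookup in a temporary table of row_count rows, each
// row_size bytes (row_size must be non-zero).
double tmp_table_lookup_cost(double max_heap_table_size,
                             double row_count, unsigned row_size)
{
  // The table stays in memory while row_count * row_size does not
  // exceed the heap limit.
  if (row_count > max_heap_table_size / static_cast<double>(row_size))
    return kDiskLookupCost;
  return kHeapLookupCost;
}
```

With a 16 MiB limit and 16-byte rows the switch happens just above 1,048,576 rows; a table of exactly that many rows still counts as in-memory.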
  2215. /**
  2216. The cost of writing a row into a temporary table with 'row_count' unique
  2217. rows each of size 'row_size'.
  2218. @param thd current query context
  2219. @param row_count number of rows in the temp table
  2220. @param row_size average size in bytes of the rows
  2221. @return the cost of writing one row
  2222. */
  2223. double
  2224. get_tmp_table_write_cost(THD *thd, double row_count, uint row_size)
  2225. {
  2226. double lookup_cost= get_tmp_table_lookup_cost(thd, row_count, row_size);
  2227. /*
  2228. TODO:
  2229. This is an optimistic estimate. Add additional costs resulting from
  2230. actually writing the row to memory/disk and possible index reorganization.
  2231. */
  2232. return lookup_cost;
  2233. }
  2234. /*
  2235. Check if table's KEYUSE elements have an eq_ref(outer_tables) candidate
  2236. SYNOPSIS
  2237. find_eq_ref_candidate()
  2238. table Table to be checked
  2239. sj_inner_tables Bitmap of inner tables. eq_ref(inner_table) doesn't
  2240. count.
  2241. DESCRIPTION
  2242. Check if table's KEYUSE elements have an eq_ref(outer_tables) candidate
  2243. TODO
  2244. Check again if it is feasible to factor common parts with constant table
  2245. search
  2246. Also check if it's feasible to factor common parts with table elimination
  2247. RETURN
  2248. TRUE - There exists an eq_ref(outer-tables) candidate
  2249. FALSE - Otherwise
  2250. */
  2251. bool find_eq_ref_candidate(TABLE *table, table_map sj_inner_tables)
  2252. {
  2253. KEYUSE *keyuse= table->reginfo.join_tab->keyuse;
  2254. if (keyuse)
  2255. {
  2256. do
  2257. {
  2258. uint key= keyuse->key;
  2259. KEY *keyinfo;
  2260. key_part_map bound_parts= 0;
  2261. bool is_excluded_key= keyuse->is_for_hash_join();
  2262. if (!is_excluded_key)
  2263. {
  2264. keyinfo= table->key_info + key;
  2265. is_excluded_key= !MY_TEST(keyinfo->flags & HA_NOSAME);
  2266. }
  2267. if (!is_excluded_key)
  2268. {
  2269. do /* For all equalities on all key parts */
  2270. {
  2271. /* Check if this is "t.keypart = expr(outer_tables)" */
  2272. if (!(keyuse->used_tables & sj_inner_tables) &&
  2273. !(keyuse->optimize & KEY_OPTIMIZE_REF_OR_NULL))
  2274. {
  2275. bound_parts |= 1 << keyuse->keypart;
  2276. }
  2277. keyuse++;
  2278. } while (keyuse->key == key && keyuse->table == table);
  2279. if (bound_parts == PREV_BITS(uint, keyinfo->user_defined_key_parts))
  2280. return TRUE;
  2281. }
  2282. else
  2283. {
  2284. do
  2285. {
  2286. keyuse++;
  2287. } while (keyuse->key == key && keyuse->table == table);
  2288. }
  2289. } while (keyuse->table == table);
  2290. }
  2291. return FALSE;
  2292. }
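The core test in find_eq_ref_candidate() is a bitmap comparison: every key part bound by an outer-table expression sets one bit, and the key qualifies only if the result equals a mask with the low `user_defined_key_parts` bits set. A minimal sketch of that check follows. `key_fully_bound` is a hypothetical helper, and the mask expression is an assumed equivalent of `PREV_BITS`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if the bound key parts cover key parts 0..n_key_parts-1.
// bound_keyparts lists the key part numbers that have an equality with
// an expression over outer tables only; duplicates are harmless.
bool key_fully_bound(const std::vector<unsigned> &bound_keyparts,
                     unsigned n_key_parts)
{
  assert(n_key_parts > 0 && n_key_parts < 64);
  std::uint64_t bound = 0;
  for (unsigned kp : bound_keyparts)
    bound |= std::uint64_t{1} << kp;
  // Mask with the lowest n_key_parts bits set.
  const std::uint64_t all_parts = (std::uint64_t{1} << n_key_parts) - 1;
  return bound == all_parts;
}
```

A two-part unique key with only its first part bound gives `bound == 0b01` against a mask of `0b11`, so the table is not an eq_ref candidate through that key.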
  2293. /*
  2294. Do semi-join optimization step after we've added a new tab to join prefix
  2295. SYNOPSIS
  2296. advance_sj_state()
  2297. join The join we're optimizing
  2298. remaining_tables Tables not in the join prefix
  2299. new_join_tab Join tab we've just added to the join prefix
  2300. idx Index of this join tab (i.e. number of tables
  2301. in the prefix minus one)
  2302. current_record_count INOUT Estimate of #records in join prefix's output
  2303. current_read_time INOUT Cost to execute the join prefix
  2304. loose_scan_pos IN A POSITION with LooseScan plan to access
  2305. table new_join_tab
  2306. (produced by the last best_access_path call)
  2307. DESCRIPTION
  2308. Update semi-join optimization state after we've added another tab (table
  2309. and access method) to the join prefix.
  2310. The state is maintained in join->positions[#prefix_size]. Each of the
  2311. available strategies has its own state variables.
  2312. for each semi-join strategy
  2313. {
  2314. update strategy's state variables;
  2315. if (join prefix has all the tables that are needed to consider
  2316. using this strategy for the semi-join(s))
  2317. {
  2318. calculate cost of using the strategy
  2319. if ((this is the first strategy to handle the semi-join nest(s) ||
  2320. the cost is less than other strategies))
  2321. {
  2322. // Pick this strategy
  2323. pos->sj_strategy= ..
  2324. ..
  2325. }
  2326. }
  2327. Most of the new state is saved in join->positions[idx] (and hence no undo
  2328. is necessary). Several members of class JOIN are updated also, these
  2329. changes can be rolled back with restore_prev_sj_state().
  2330. See setup_semijoin_dups_elimination() for a description of what kinds of
  2331. join prefixes each strategy can handle.
  2332. */
  2333. bool is_multiple_semi_joins(JOIN *join, POSITION *prefix, uint idx, table_map inner_tables)
  2334. {
  2335. for (int i= (int)idx; i >= 0; i--)
  2336. {
  2337. TABLE_LIST *emb_sj_nest;
  2338. if ((emb_sj_nest= prefix[i].table->emb_sj_nest))
  2339. {
  2340. if (inner_tables & emb_sj_nest->sj_inner_tables)
  2341. return !MY_TEST(inner_tables == (emb_sj_nest->sj_inner_tables &
  2342. ~join->const_table_map));
  2343. }
  2344. }
  2345. return FALSE;
  2346. }
  2347. void advance_sj_state(JOIN *join, table_map remaining_tables, uint idx,
  2348. double *current_record_count, double *current_read_time,
  2349. POSITION *loose_scan_pos)
  2350. {
  2351. POSITION *pos= join->positions + idx;
  2352. const JOIN_TAB *new_join_tab= pos->table;
  if (join->emb_sjm_nest ||                       //(1)
      !join->select_lex->have_merged_subqueries)  //(2)
  {
    /*
      (1): We're performing optimization inside SJ-Materialization nest:
       - there are no other semi-joins inside semi-join nests
       - attempts to build semi-join strategies here will confuse
         the optimizer, so bail out.
      (2): Don't waste time on semi-join optimizations if we don't have any
        semi-joins
    */
    pos->sj_strategy= SJ_OPT_NONE;
    return;
  }

  Semi_join_strategy_picker *pickers[]=
  {
    &pos->firstmatch_picker,
    &pos->loosescan_picker,
    &pos->sjmat_picker,
    &pos->dups_weedout_picker,
    NULL,
  };
  Json_writer_array trace_steps(join->thd, "semijoin_strategy_choice");
  /*
    Update join->cur_sj_inner_tables (Used by FirstMatch in this function and
    LooseScan detector in best_access_path)
  */
  remaining_tables &= ~new_join_tab->table->map;
  table_map dups_producing_tables, UNINIT_VAR(prev_dups_producing_tables),
    UNINIT_VAR(prev_sjm_lookup_tables);

  if (idx == join->const_tables)
    dups_producing_tables= 0;
  else
    dups_producing_tables= pos[-1].dups_producing_tables;

  TABLE_LIST *emb_sj_nest;
  if ((emb_sj_nest= new_join_tab->emb_sj_nest))
    dups_producing_tables |= emb_sj_nest->sj_inner_tables;

  Semi_join_strategy_picker **strategy, **prev_strategy= 0;
  if (idx == join->const_tables)
  {
    /* First table, initialize pickers */
    for (strategy= pickers; *strategy != NULL; strategy++)
      (*strategy)->set_empty();
    pos->inner_tables_handled_with_other_sjs= 0;
  }
  else
  {
    for (strategy= pickers; *strategy != NULL; strategy++)
    {
      (*strategy)->set_from_prev(pos - 1);
    }
    pos->inner_tables_handled_with_other_sjs=
      pos[-1].inner_tables_handled_with_other_sjs;
  }

  pos->prefix_cost.convert_from_cost(*current_read_time);
  pos->prefix_record_count= *current_record_count;

  {
    pos->sj_strategy= SJ_OPT_NONE;

    for (strategy= pickers; *strategy != NULL; strategy++)
    {
      table_map handled_fanout;
      sj_strategy_enum sj_strategy;
      double rec_count= *current_record_count;
      double read_time= *current_read_time;
      if ((*strategy)->check_qep(join, idx, remaining_tables,
                                 new_join_tab,
                                 &rec_count,
                                 &read_time,
                                 &handled_fanout,
                                 &sj_strategy,
                                 loose_scan_pos))
      {
        /*
          It's possible to use the strategy. Use it, if
           - it removes semi-join fanout that was not removed before
           - using it is cheaper than using something else,
               and {if some other strategy has removed fanout
               that this strategy is trying to remove, then it
               did remove the fanout only for one semi-join}
               This is to avoid a situation when
                1. strategy X removes fanout for semijoin X,Y
                2. using strategy Z is cheaper, but it only removes
                   fanout from semijoin X.
                3. We have no clue what to do about fanout of semi-join Y.
        */
        if ((dups_producing_tables & handled_fanout) ||
            (read_time < *current_read_time &&
             !(handled_fanout & pos->inner_tables_handled_with_other_sjs)))
        {
          DBUG_ASSERT(pos->sj_strategy != sj_strategy);
          /*
            Accept the strategy if it is the first one chosen at this
            position, or if it replaces a strategy that handled exactly
            the same tables
          */
          if (pos->sj_strategy == SJ_OPT_NONE ||
              handled_fanout ==
                (prev_dups_producing_tables ^ dups_producing_tables))
          {
            prev_strategy= strategy;
            if (pos->sj_strategy == SJ_OPT_NONE)
            {
              prev_dups_producing_tables= dups_producing_tables;
              prev_sjm_lookup_tables= join->sjm_lookup_tables;
            }
            /* Mark strategy as used */
            (*strategy)->mark_used();
            pos->sj_strategy= sj_strategy;
            if (sj_strategy == SJ_OPT_MATERIALIZE)
              join->sjm_lookup_tables |= handled_fanout;
            else
              join->sjm_lookup_tables &= ~handled_fanout;
            *current_read_time= read_time;
            *current_record_count= rec_count;
            dups_producing_tables &= ~handled_fanout;
            //TODO: update bitmap of semi-joins that were handled together with
            // others.
            if (is_multiple_semi_joins(join, join->positions, idx,
                                       handled_fanout))
              pos->inner_tables_handled_with_other_sjs |= handled_fanout;
          }
          else
          {
            /* Conflict: fall back to the most general variant */
            (*prev_strategy)->set_empty();
            dups_producing_tables= prev_dups_producing_tables;
            join->sjm_lookup_tables= prev_sjm_lookup_tables;
            // mark it 'none' to avoid loops
            pos->sj_strategy= SJ_OPT_NONE;
            // skip ahead: after 'continue' the loop's strategy++ lands on the
            // last picker (DuplicateWeedout)
            strategy= pickers +
              (sizeof(pickers)/sizeof(Semi_join_strategy_picker*) - 3);
            continue;
          }
        }
        else
        {
          /* We decided not to apply the strategy. */
          (*strategy)->set_empty();
        }
      }
    }

    if (unlikely(join->thd->trace_started() && pos->sj_strategy != SJ_OPT_NONE))
    {
      Json_writer_object tr(join->thd);
      const char *sname;
      switch (pos->sj_strategy) {
      case SJ_OPT_MATERIALIZE:
        sname= "SJ-Materialize";
        break;
      case SJ_OPT_MATERIALIZE_SCAN:
        sname= "SJ-Materialize-Scan";
        break;
      case SJ_OPT_FIRST_MATCH:
        sname= "FirstMatch";
        break;
      case SJ_OPT_DUPS_WEEDOUT:
        sname= "DuplicateWeedout";
        break;
      case SJ_OPT_LOOSE_SCAN:
        sname= "LooseScan";
        break;
      default:
        DBUG_ASSERT(0);
        sname="Invalid";
      }
      tr.add("chosen_strategy", sname);
    }
  }

  if ((emb_sj_nest= new_join_tab->emb_sj_nest))
  {
    join->cur_sj_inner_tables |= emb_sj_nest->sj_inner_tables;

    /* Remove the sj_nest if all of its SJ-inner tables are in cur_table_map */
    if (!(remaining_tables &
          emb_sj_nest->sj_inner_tables & ~new_join_tab->table->map))
      join->cur_sj_inner_tables &= ~emb_sj_nest->sj_inner_tables;
  }

  pos->prefix_cost.convert_from_cost(*current_read_time);
  pos->prefix_record_count= *current_record_count;
  pos->dups_producing_tables= dups_producing_tables;
}
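
/*
  Illustration only, not part of the server source: a minimal standalone
  sketch of the acceptance test applied to each picker in the loop above.
  The alias table_map_t and the function strategy_is_applicable() are
  hypothetical names introduced for this sketch; the real code works on
  table_map and on POSITION members.
*/

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the server's table_map bitmap type.
using table_map_t = std::uint64_t;

// A strategy is taken if it removes fanout that nothing has removed yet,
// or if it is cheaper and does not touch fanout that an earlier strategy
// removed together with other semi-joins.
bool strategy_is_applicable(table_map_t dups_producing_tables,
                            table_map_t handled_fanout,
                            double read_time,
                            double current_read_time,
                            table_map_t handled_with_other_sjs)
{
  if (dups_producing_tables & handled_fanout)
    return true;
  return read_time < current_read_time &&
         !(handled_fanout & handled_with_other_sjs);
}
```

/*
  Note that the first branch ignores cost: a strategy that removes still
  pending fanout is accepted even when it is more expensive, because the
  fanout has to be removed by something.
*/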

void Sj_materialization_picker::set_from_prev(struct st_position *prev)
{
  if (prev->sjmat_picker.is_used)
    set_empty();
  else
  {
    sjm_scan_need_tables= prev->sjmat_picker.sjm_scan_need_tables;
    sjm_scan_last_inner=  prev->sjmat_picker.sjm_scan_last_inner;
  }
  is_used= FALSE;
}

bool Sj_materialization_picker::check_qep(JOIN *join,
                                          uint idx,
                                          table_map remaining_tables,
                                          const JOIN_TAB *new_join_tab,
                                          double *record_count,
                                          double *read_time,
                                          table_map *handled_fanout,
                                          sj_strategy_enum *strategy,
                                          POSITION *loose_scan_pos)
{
  bool sjm_scan;
  SJ_MATERIALIZATION_INFO *mat_info;
  THD *thd= join->thd;
  if ((mat_info= at_sjmat_pos(join, remaining_tables,
                              new_join_tab, idx, &sjm_scan)))
  {
    if (sjm_scan)
    {
      /*
        We can't evaluate this option yet. This is because we can't
        account for fanout of sj-inner tables yet:

          ntX  SJM-SCAN(it1 ... itN) | ot1 ... otN  |
                                     ^(1)           ^(2)

        we're now at position (1). SJM temptable in general has multiple
        records, so at point (1) we'll get the fanout from sj-inner tables (ie
        there will be multiple record combinations).

        The final join result will not contain any semi-join produced
        fanout, i.e. tables within SJM-SCAN(...) will not contribute to
        the cardinality of the join output.  Extra fanout produced by
        SJM-SCAN(...) will be 'absorbed' into fanout produced by ot1 ...  otN.

        The simple way to model this is to remove SJM-SCAN(...) fanout once
        we reach the point #2.
      */
      sjm_scan_need_tables=
        new_join_tab->emb_sj_nest->sj_inner_tables |
        new_join_tab->emb_sj_nest->nested_join->sj_depends_on |
        new_join_tab->emb_sj_nest->nested_join->sj_corr_tables;
      sjm_scan_last_inner= idx;
    }
    else
    {
      Json_writer_object trace(join->thd);
      trace.add("strategy", "SJ-Materialization");
      /* This is SJ-Materialization with lookups */
      Cost_estimate prefix_cost;
      signed int first_tab= (int)idx - mat_info->tables;
      double prefix_rec_count;
      if (first_tab < (int)join->const_tables)
      {
        prefix_cost.reset();
        prefix_rec_count= 1.0;
      }
      else
      {
        prefix_cost= join->positions[first_tab].prefix_cost;
        prefix_rec_count= join->positions[first_tab].prefix_record_count;
      }

      double mat_read_time= prefix_cost.total_cost();
      mat_read_time=
        COST_ADD(mat_read_time,
                 COST_ADD(mat_info->materialization_cost.total_cost(),
                          COST_MULT(prefix_rec_count,
                                    mat_info->lookup_cost.total_cost())));

      /*
        NOTE: When we pick to use SJM[-Scan] we don't memcpy its POSITION
        elements to join->positions as that makes it hard to return things
        back when making one step back in join optimization. That's done
        after the QEP has been chosen.
      */
      *read_time=    mat_read_time;
      *record_count= prefix_rec_count;
      *handled_fanout= new_join_tab->emb_sj_nest->sj_inner_tables;
      *strategy= SJ_OPT_MATERIALIZE;
      if (unlikely(join->thd->trace_started()))
      {
        trace.add("records", *record_count);
        trace.add("read_time", *read_time);
      }
      return TRUE;
    }
  }

  /* 4.A SJM-Scan second phase check */
  if (sjm_scan_need_tables && /* Have SJM-Scan prefix */
      !(sjm_scan_need_tables & remaining_tables))
  {
    Json_writer_object trace(join->thd);
    trace.add("strategy", "SJ-Materialization-Scan");
    TABLE_LIST *mat_nest=
      join->positions[sjm_scan_last_inner].table->emb_sj_nest;
    SJ_MATERIALIZATION_INFO *mat_info= mat_nest->sj_mat_info;

    double prefix_cost;
    double prefix_rec_count;
    int first_tab= sjm_scan_last_inner + 1 - mat_info->tables;
    /* Get the prefix cost */
    if (first_tab == (int)join->const_tables)
    {
      prefix_rec_count= 1.0;
      prefix_cost= 0.0;
    }
    else
    {
      prefix_cost= join->positions[first_tab - 1].prefix_cost.total_cost();
      prefix_rec_count= join->positions[first_tab - 1].prefix_record_count;
    }

    /* Add materialization cost */
    prefix_cost=
      COST_ADD(prefix_cost,
               COST_ADD(mat_info->materialization_cost.total_cost(),
                        COST_MULT(prefix_rec_count,
                                  mat_info->scan_cost.total_cost())));
    prefix_rec_count= COST_MULT(prefix_rec_count, mat_info->rows);

    uint i;
    table_map rem_tables= remaining_tables;
    for (i= idx; i != (first_tab + mat_info->tables - 1); i--)
      rem_tables |= join->positions[i].table->table->map;

    POSITION curpos, dummy;
    /* Need to re-run best_access_path as prefix_rec_count has changed */
    bool disable_jbuf= (join->thd->variables.join_cache_level == 0);
    Json_writer_temp_disable trace_semijoin_mat_scan(thd);
    for (i= first_tab + mat_info->tables; i <= idx; i++)
    {
      best_access_path(join, join->positions[i].table, rem_tables, i,
                       disable_jbuf, prefix_rec_count, &curpos, &dummy);
      prefix_rec_count= COST_MULT(prefix_rec_count, curpos.records_read);
      prefix_cost= COST_ADD(prefix_cost, curpos.read_time);
      prefix_cost= COST_ADD(prefix_cost,
                            prefix_rec_count / (double) TIME_FOR_COMPARE);
      //TODO: take into account join condition selectivity here
    }

    *strategy= SJ_OPT_MATERIALIZE_SCAN;
    *read_time=    prefix_cost;
    *record_count= prefix_rec_count / mat_info->rows_with_duplicates;
    *handled_fanout= mat_nest->sj_inner_tables;
    if (unlikely(join->thd->trace_started()))
    {
      trace.add("records", *record_count);
      trace.add("read_time", *read_time);
    }
    return TRUE;
  }
  return FALSE;
}
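
/*
  Illustration only, not part of the server source: the SJ-Materialization
  lookup cost computed above, written out with plain arithmetic. The
  function name and parameters are hypothetical. The sketch assumes finite
  inputs and therefore ignores the overflow protection that COST_ADD and
  COST_MULT provide in the real code.
*/

```cpp
#include <cassert>

// Cost of the join prefix, plus a one-time materialization of the subquery,
// plus one temptable lookup per record combination of the prefix.
double sjm_lookup_read_time(double prefix_cost,
                            double prefix_rec_count,
                            double materialization_cost,
                            double lookup_cost)
{
  assert(prefix_rec_count >= 0.0);
  return prefix_cost +
         (materialization_cost + prefix_rec_count * lookup_cost);
}
```

/*
  For example, a prefix costing 10 that produces 100 record combinations,
  with a materialization cost of 50 and a lookup cost of 0.5, gives
  10 + 50 + 100 * 0.5 = 110. The output record count stays at the prefix
  record count, since lookups add no fanout.
*/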

void LooseScan_picker::set_from_prev(struct st_position *prev)
{
  if (prev->loosescan_picker.is_used)
    set_empty();
  else
  {
    first_loosescan_table= prev->loosescan_picker.first_loosescan_table;
    loosescan_need_tables= prev->loosescan_picker.loosescan_need_tables;
  }
  is_used= FALSE;
}

bool LooseScan_picker::check_qep(JOIN *join,
                                 uint idx,
                                 table_map remaining_tables,
                                 const JOIN_TAB *new_join_tab,
                                 double *record_count,
                                 double *read_time,
                                 table_map *handled_fanout,
                                 sj_strategy_enum *strategy,
                                 struct st_position *loose_scan_pos)
{
  POSITION *first= join->positions + first_loosescan_table;
  /*
    LooseScan strategy can't handle interleaving between tables from the
    semi-join that LooseScan is handling and any other tables.

    If we were considering LooseScan for the join prefix (1)
       and the table we're adding creates an interleaving (2)
    then
       stop considering loose scan
  */
  if ((first_loosescan_table != MAX_TABLES) &&   // (1)
      (first->table->emb_sj_nest->sj_inner_tables & remaining_tables) && //(2)
      new_join_tab->emb_sj_nest != first->table->emb_sj_nest) //(2)
  {
    first_loosescan_table= MAX_TABLES;
  }

  /*
    If we got an option to use LooseScan for the current table, start
    considering using LooseScan strategy
  */
  if (loose_scan_pos->read_time != DBL_MAX && !join->outer_join)
  {
    first_loosescan_table= idx;
    loosescan_need_tables=
      new_join_tab->emb_sj_nest->sj_inner_tables |
      new_join_tab->emb_sj_nest->nested_join->sj_depends_on |
      new_join_tab->emb_sj_nest->nested_join->sj_corr_tables;
  }

  if ((first_loosescan_table != MAX_TABLES) &&
      !(remaining_tables & loosescan_need_tables) &&
      (new_join_tab->table->map & loosescan_need_tables))
  {
    Json_writer_object trace(join->thd);
    /* Report the strategy that is actually returned below */
    trace.add("strategy", "LooseScan");
    /*
      Ok, we have a LooseScan plan and also have all of the LooseScan
      sj-nest's inner tables and outer correlated tables in the prefix.
    */

    first= join->positions + first_loosescan_table;
    uint n_tables= my_count_bits(first->table->emb_sj_nest->sj_inner_tables);
    /* Got a complete LooseScan range. Calculate its cost */
    /*
      The same problem as with FirstMatch - we need to save POSITIONs
      somewhere but reserving space for all cases would require too
      much space. We will re-calculate POSITION structures later on.
    */
    bool disable_jbuf= (join->thd->variables.join_cache_level == 0);
    optimize_wo_join_buffering(join, first_loosescan_table, idx,
                               remaining_tables,
                               TRUE,  //first_alt
                               disable_jbuf ? join->table_count :
                                 first_loosescan_table + n_tables,
                               record_count,
                               read_time);
    /*
      We don't yet have any other strategies that could handle this
      semi-join nest (the other options are Duplicate Elimination or
      Materialization, which need at least the same set of tables in
      the join prefix to be considered) so unconditionally pick the
      LooseScan.
    */
    *strategy= SJ_OPT_LOOSE_SCAN;
    *handled_fanout= first->table->emb_sj_nest->sj_inner_tables;
    if (unlikely(join->thd->trace_started()))
    {
      trace.add("records", *record_count);
      trace.add("read_time", *read_time);
    }
    return TRUE;
  }
  return FALSE;
}

void Firstmatch_picker::set_from_prev(struct st_position *prev)
{
  if (prev->firstmatch_picker.is_used)
    invalidate_firstmatch_prefix();
  else
  {
    first_firstmatch_table= prev->firstmatch_picker.first_firstmatch_table;
    first_firstmatch_rtbl=  prev->firstmatch_picker.first_firstmatch_rtbl;
    firstmatch_need_tables= prev->firstmatch_picker.firstmatch_need_tables;
  }
  is_used= FALSE;
}

bool Firstmatch_picker::check_qep(JOIN *join,
                                  uint idx,
                                  table_map remaining_tables,
                                  const JOIN_TAB *new_join_tab,
                                  double *record_count,
                                  double *read_time,
                                  table_map *handled_fanout,
                                  sj_strategy_enum *strategy,
                                  POSITION *loose_scan_pos)
{
  if (new_join_tab->emb_sj_nest &&
      optimizer_flag(join->thd, OPTIMIZER_SWITCH_FIRSTMATCH) &&
      !join->outer_join)
  {
    const table_map outer_corr_tables=
      new_join_tab->emb_sj_nest->nested_join->sj_corr_tables |
      new_join_tab->emb_sj_nest->nested_join->sj_depends_on;
    const table_map sj_inner_tables=
      new_join_tab->emb_sj_nest->sj_inner_tables & ~join->const_table_map;

    /*
      Enter condition:
       1. The next join tab belongs to semi-join nest
          (verified for the encompassing code block above).
       2. We're not in a duplicate producer range yet
       3. All outer tables that
           - the subquery is correlated with, or
           - referred to from the outer_expr
          are in the join prefix
       4. All inner tables are still part of remaining_tables.
    */
    if (!join->cur_sj_inner_tables &&              // (2)
        !(remaining_tables & outer_corr_tables) && // (3)
        (sj_inner_tables ==                        // (4)
         ((remaining_tables | new_join_tab->table->map) & sj_inner_tables)))
    {
      /* Start tracking potential FirstMatch range */
      first_firstmatch_table= idx;
      firstmatch_need_tables= sj_inner_tables;
      first_firstmatch_rtbl= remaining_tables;
    }

    if (in_firstmatch_prefix())
    {
      if (outer_corr_tables & first_firstmatch_rtbl)
      {
        /*
          Trying to add an sj-inner table whose sj-nest has an outer correlated
          table that was not in the prefix. This means FirstMatch can't be used.
        */
        invalidate_firstmatch_prefix();
      }
      else
      {
        /* Record that we need all of this semi-join's inner tables, too */
        firstmatch_need_tables|= sj_inner_tables;
      }

      if (in_firstmatch_prefix() &&
          !(firstmatch_need_tables & remaining_tables))
      {
        Json_writer_object trace(join->thd);
        trace.add("strategy", "FirstMatch");
        /*
          Got a complete FirstMatch range. Calculate correct costs and fanout
        */

        if (idx == first_firstmatch_table &&
            optimizer_flag(join->thd, OPTIMIZER_SWITCH_SEMIJOIN_WITH_CACHE))
        {
          /*
            An important special case: only one inner table, and @@optimizer_switch
            allows join buffering.
             - read_time is the same (i.e. FirstMatch doesn't add any cost)
             - remove fanout added by the last table
          */
          if (*record_count)
            *record_count /= join->positions[idx].records_read;
        }
        else
        {
          optimize_wo_join_buffering(join, first_firstmatch_table, idx,
                                     remaining_tables, FALSE, idx,
                                     record_count,
                                     read_time);
        }
        /*
          We ought to save the alternate POSITIONs produced by
          optimize_wo_join_buffering but the problem is that providing save
          space uses too much space. Instead, we will re-calculate the
          alternate POSITIONs after we've picked the best QEP.
        */
        *handled_fanout= firstmatch_need_tables;
        /* *record_count and *read_time were set above */
        *strategy= SJ_OPT_FIRST_MATCH;
        if (unlikely(join->thd->trace_started()))
        {
          trace.add("records", *record_count);
          trace.add("read_time", *read_time);
        }
        return TRUE;
      }
    }
  }
  else
    invalidate_firstmatch_prefix();
  return FALSE;
}

void Duplicate_weedout_picker::set_from_prev(POSITION *prev)
{
  if (prev->dups_weedout_picker.is_used)
    set_empty();
  else
  {
    dupsweedout_tables=      prev->dups_weedout_picker.dupsweedout_tables;
    first_dupsweedout_table= prev->dups_weedout_picker.first_dupsweedout_table;
  }
  is_used= FALSE;
}

bool Duplicate_weedout_picker::check_qep(JOIN *join,
                                         uint idx,
                                         table_map remaining_tables,
                                         const JOIN_TAB *new_join_tab,
                                         double *record_count,
                                         double *read_time,
                                         table_map *handled_fanout,
                                         sj_strategy_enum *strategy,
                                         POSITION *loose_scan_pos
                                         )
{
  TABLE_LIST *nest;
  if ((nest= new_join_tab->emb_sj_nest))
  {
    if (!dupsweedout_tables)
      first_dupsweedout_table= idx;

    dupsweedout_tables |= nest->sj_inner_tables |
                          nest->nested_join->sj_depends_on |
                          nest->nested_join->sj_corr_tables;
  }

  if (dupsweedout_tables)
  {
    /* we're in the process of constructing a DuplicateWeedout range */
    TABLE_LIST *emb= new_join_tab->table->pos_in_table_list->embedding;
    /* and we've entered an inner side of an outer join */
    if (emb && emb->on_expr)
      dupsweedout_tables |= emb->nested_join->used_tables;
  }

  /* If this is the last table that we need for DuplicateWeedout range */
  if (dupsweedout_tables && !(remaining_tables & ~new_join_tab->table->map &
                              dupsweedout_tables))
  {
    /*
      Ok, reached a state where we could put a dups weedout point.
      Walk back and calculate
        - the join cost (this is needed as the accumulated cost may assume
          some other duplicate elimination method)
        - extra fanout that will be removed by duplicate elimination
        - duplicate elimination cost
      There are two cases:
        1. We have other strategy/ies to remove all of the duplicates.
        2. We don't.

      We need to calculate the cost in case #2 also because we need to make
      choice between this join order and others.
    */
    uint first_tab= first_dupsweedout_table;
    double dups_cost;
    double prefix_rec_count;
    double sj_inner_fanout= 1.0;
    double sj_outer_fanout= 1.0;
    uint temptable_rec_size;
    Json_writer_object trace(join->thd);
    trace.add("strategy", "DuplicateWeedout");

    if (first_tab == join->const_tables)
    {
      prefix_rec_count= 1.0;
      temptable_rec_size= 0;
      dups_cost= 0.0;
    }
    else
    {
      dups_cost= join->positions[first_tab - 1].prefix_cost.total_cost();
      prefix_rec_count= join->positions[first_tab - 1].prefix_record_count;
      temptable_rec_size= 8; /* This is not true but we'll make it so */
    }

    table_map dups_removed_fanout= 0;
    double current_fanout= prefix_rec_count;
    for (uint j= first_dupsweedout_table; j <= idx; j++)
    {
      POSITION *p= join->positions + j;
      current_fanout= COST_MULT(current_fanout, p->records_read);
      dups_cost= COST_ADD(dups_cost,
                          COST_ADD(p->read_time,
                                   current_fanout / TIME_FOR_COMPARE));
      if (p->table->emb_sj_nest)
      {
        sj_inner_fanout= COST_MULT(sj_inner_fanout, p->records_read);
        dups_removed_fanout |= p->table->table->map;
      }
      else
      {
        sj_outer_fanout= COST_MULT(sj_outer_fanout, p->records_read);
        temptable_rec_size += p->table->table->file->ref_length;
      }
    }

    /*
      Add the cost of temptable use. The table will have sj_outer_fanout
      records, and we will make
      - sj_outer_fanout table writes
      - sj_inner_fanout*sj_outer_fanout lookups.
    */
    double one_lookup_cost= get_tmp_table_lookup_cost(join->thd,
                                                      sj_outer_fanout,
                                                      temptable_rec_size);
    double one_write_cost= get_tmp_table_write_cost(join->thd,
                                                    sj_outer_fanout,
                                                    temptable_rec_size);

    double write_cost= COST_MULT(join->positions[first_tab].prefix_record_count,
                                 sj_outer_fanout * one_write_cost);
    double full_lookup_cost=
      COST_MULT(join->positions[first_tab].prefix_record_count,
                COST_MULT(sj_outer_fanout,
                          sj_inner_fanout * one_lookup_cost));
    dups_cost= COST_ADD(dups_cost, COST_ADD(write_cost, full_lookup_cost));

    *read_time= dups_cost;
    *record_count= prefix_rec_count * sj_outer_fanout;
    *handled_fanout= dups_removed_fanout;
    *strategy= SJ_OPT_DUPS_WEEDOUT;
    if (unlikely(join->thd->trace_started()))
    {
      trace.add("records", *record_count);
      trace.add("read_time", *read_time);
    }
    return TRUE;
  }
  return FALSE;
}
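
/*
  Illustration only, not part of the server source: the temptable part of
  the DuplicateWeedout cost above, with plain arithmetic. The struct and
  function names are hypothetical, the per-operation costs are passed in
  instead of being obtained from get_tmp_table_write_cost() and
  get_tmp_table_lookup_cost(), and the overflow protection of COST_ADD and
  COST_MULT is left out on the assumption that all inputs are finite.
*/

```cpp
#include <cassert>

// Hypothetical result holder for this sketch.
struct WeedoutTmpTableCost
{
  double write_cost;   // cost of inserting the outer row combinations
  double lookup_cost;  // cost of probing once per produced row combination
};

// prefix_count: record count in front of the weedout range.
// sj_outer_fanout / sj_inner_fanout: products of records_read over the
// outer and the sj-inner tables of the range, respectively.
WeedoutTmpTableCost weedout_tmp_table_cost(double prefix_count,
                                           double sj_outer_fanout,
                                           double sj_inner_fanout,
                                           double one_write_cost,
                                           double one_lookup_cost)
{
  assert(prefix_count >= 0.0);
  WeedoutTmpTableCost c;
  c.write_cost= prefix_count * (sj_outer_fanout * one_write_cost);
  c.lookup_cost= prefix_count *
                 (sj_outer_fanout * (sj_inner_fanout * one_lookup_cost));
  return c;
}
```

/*
  With prefix_count= 10, sj_outer_fanout= 4, sj_inner_fanout= 5, a write
  cost of 0.25 and a lookup cost of 0.5, the writes cost 10 and the lookups
  cost 100. Only the outer fanout survives into the output record count;
  the inner fanout is what the weedout removes.
*/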

/*
  Remove the last join tab from the join->cur_sj_inner_tables bitmap.
  We assume remaining_tables doesn't contain @tab.
*/

void restore_prev_sj_state(const table_map remaining_tables,
                           const JOIN_TAB *tab, uint idx)
{
  TABLE_LIST *emb_sj_nest;

  if (tab->emb_sj_nest)
  {
    table_map subq_tables= tab->emb_sj_nest->sj_inner_tables;
    tab->join->sjm_lookup_tables &= ~subq_tables;
  }

  if ((emb_sj_nest= tab->emb_sj_nest))
  {
    /* If we're removing the last SJ-inner table, remove the sj-nest */
    if ((remaining_tables & emb_sj_nest->sj_inner_tables) ==
        (emb_sj_nest->sj_inner_tables & ~tab->table->map))
    {
      tab->join->cur_sj_inner_tables &= ~emb_sj_nest->sj_inner_tables;
    }
  }
}

/*
  Given a semi-join nest, find out which of the IN-equalities are bound

  SYNOPSIS
    get_bound_sj_equalities()
      sj_nest           Semi-join nest
      remaining_tables  Tables that are not yet bound

  DESCRIPTION
    Given a semi-join nest, find out which of the IN-equalities have their
    left part expression bound (i.e. the said expression doesn't refer to
    any of remaining_tables and can be evaluated).

  RETURN
    Bitmap of bound IN-equalities.
*/

ulonglong get_bound_sj_equalities(TABLE_LIST *sj_nest,
                                  table_map remaining_tables)
{
  List_iterator<Item_ptr> li(sj_nest->nested_join->sj_outer_expr_list);
  Item **item;
  uint i= 0;
  ulonglong res= 0;
  while ((item= li++))
  {
    /*
      Q: should this take into account equality propagation and how?
      A: If e->outer_side is an Item_field, walk over the equality
         class and see if there is an element that is bound?
      (this is an optional feature)
    */
    if (!(item[0]->used_tables() & remaining_tables))
    {
      res |= 1ULL << i;
    }
    i++;
  }
  return res;
}
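
/*
  Illustration only, not part of the server source: the bitmap computation
  of get_bound_sj_equalities() on plain integers. Each element of the
  vector stands for the used_tables() bitmap of one outer expression; the
  alias and function name are hypothetical.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the server's table_map bitmap type.
using table_map_t = std::uint64_t;

// Bit i of the result is set when outer expression i refers to none of
// the tables in remaining_tables, i.e. it can already be evaluated.
std::uint64_t bound_equalities(const std::vector<table_map_t> &used_tables,
                               table_map_t remaining_tables)
{
  assert(used_tables.size() <= 64);  // one result bit per equality
  std::uint64_t res= 0;
  for (std::size_t i= 0; i < used_tables.size(); i++)
  {
    if (!(used_tables[i] & remaining_tables))
      res |= 1ULL << i;
  }
  return res;
}
```

/*
  With expressions using tables {0x1, 0x2, 0x3} and remaining_tables= 0x2,
  only the first expression is bound, so the result is 0x1.
*/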

/*
  Check if the last tables of the partial join order allow to use
  sj-materialization strategy for them

  SYNOPSIS
    at_sjmat_pos()
      join
      remaining_tables
      tab                the last table's join tab
      idx                last table's index
      loose_scan    OUT  TRUE <=> use SJ-Materialization-Scan (not lookup)

  RETURN
    Pointer to the nest's SJ_MATERIALIZATION_INFO if sj-materialization
    can be applied
    NULL if some of the requirements are not met
*/

static SJ_MATERIALIZATION_INFO *
at_sjmat_pos(const JOIN *join, table_map remaining_tables, const JOIN_TAB *tab,
             uint idx, bool *loose_scan)
{
  /*
   Check if
    1. We're in a semi-join nest that can be run with SJ-materialization
    2. All the tables correlated through the IN subquery are in the prefix
  */
  TABLE_LIST *emb_sj_nest= tab->emb_sj_nest;
  table_map suffix= remaining_tables & ~tab->table->map;
  if (emb_sj_nest && emb_sj_nest->sj_mat_info &&
      !(suffix & emb_sj_nest->sj_inner_tables))
  {
    /*
      Walk back and check if all immediately preceding tables are from
      this semi-join.
    */
    uint n_tables= my_count_bits(tab->emb_sj_nest->sj_inner_tables);
    for (uint i= 1; i < n_tables ; i++)
    {
      if (join->positions[idx - i].table->emb_sj_nest != tab->emb_sj_nest)
        return NULL;
    }
    *loose_scan= MY_TEST(remaining_tables & ~tab->table->map &
                         (emb_sj_nest->sj_inner_tables |
                          emb_sj_nest->nested_join->sj_depends_on));
    if (*loose_scan && !emb_sj_nest->sj_subq_pred->sjm_scan_allowed)
      return NULL;
    else
      return emb_sj_nest->sj_mat_info;
  }
  return NULL;
}

/*
  Re-calculate values of join->best_positions[start..end].prefix_record_count
*/

static void recalculate_prefix_record_count(JOIN *join, uint start, uint end)
{
  for (uint j= start; j < end ;j++)
  {
    double prefix_count;
    if (j == join->const_tables)
      prefix_count= 1.0;
    else
      prefix_count= COST_MULT(join->best_positions[j-1].prefix_record_count,
                              join->best_positions[j-1].records_read);

    join->best_positions[j].prefix_record_count= prefix_count;
  }
}
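
/*
  Illustration only, not part of the server source: the recurrence used by
  recalculate_prefix_record_count(), run on a small array. The struct Pos
  and the function name are hypothetical; the sketch assumes
  start >= const_tables and end <= positions.size(), and it multiplies
  directly instead of using COST_MULT.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduced form of POSITION with only the two fields used here.
struct Pos
{
  double records_read;
  double prefix_record_count;
};

// The prefix record count of table j is 1 for the first non-const table,
// otherwise the previous table's prefix count times its records_read.
void recalc_prefix_counts(std::vector<Pos> &positions,
                          std::size_t const_tables,
                          std::size_t start, std::size_t end)
{
  assert(start >= const_tables && end <= positions.size());
  for (std::size_t j= start; j < end; j++)
  {
    positions[j].prefix_record_count=
      (j == const_tables) ? 1.0
                          : positions[j-1].prefix_record_count *
                            positions[j-1].records_read;
  }
}
```

/*
  For records_read values 10, 5, 2 and no const tables, the prefix record
  counts come out as 1, 10 and 50. The loop excludes the position at index
  'end' itself.
*/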

/*
  Fix semi-join strategies for the picked join order

  SYNOPSIS
    fix_semijoin_strategies_for_picked_join_order()
      join  The join with the picked join order

  DESCRIPTION
    Fix semi-join strategies for the picked join order. This is a step that
    needs to be done right after we have fixed the join order. What we do
    here is switch join's semi-join strategy description from backward-based
    to forward-based.

    When join optimization is in progress, we re-consider semi-join
    strategies after we've added another table. Here's an illustration.
    Suppose the join optimization is underway:

    1) ot1  it1  it2
                 sjX  -- looking at (ot1, it1, it2) join prefix, we decide
                         to use semi-join strategy sjX.

    2) ot1  it1  it2  ot2
                 sjX  sjY -- Having added table ot2, we now may consider
                             another semi-join strategy and decide to use a
                             different strategy sjY. Note that the record
                             of sjX has remained under it2. That is
                             necessary because we need to be able to get
                             back to (ot1, it1, it2) join prefix.
      What makes things even worse is that there are cases where the choice
      of sjY changes the way we should access it2.

    3) [ot1  it1  it2  ot2  ot3]
                  sjX  sjY  -- This means that after join optimization is
                               finished, semi-join info should be read
                               right-to-left (while nearly all plan refinement
                               functions, EXPLAIN, etc proceed from left to
                               right)

    This function does the needed reversal, making it possible to read the
    join and semi-join order from left to right.
*/

void fix_semijoin_strategies_for_picked_join_order(JOIN *join)
{
  uint table_count=join->table_count;
  uint tablenr;
  table_map remaining_tables= 0;
  table_map handled_tabs= 0;
  join->sjm_lookup_tables= 0;
  join->sjm_scan_tables= 0;
  THD *thd= join->thd;
  if (!join->select_lex->sj_nests.elements)
    return;

  Json_writer_object trace_wrapper(thd);
  Json_writer_array trace_semijoin_strategies(thd,
                              "fix_semijoin_strategies_for_picked_join_order");

  for (tablenr= table_count - 1 ; tablenr != join->const_tables - 1; tablenr--)
  {
    POSITION *pos= join->best_positions + tablenr;
    JOIN_TAB *s= pos->table;
    uint UNINIT_VAR(first); // Set by every branch except SJ_OPT_NONE which doesn't use it

    if ((handled_tabs & s->table->map) || pos->sj_strategy == SJ_OPT_NONE)
    {
      remaining_tables |= s->table->map;
      continue;
    }

    if (pos->sj_strategy == SJ_OPT_MATERIALIZE)
    {
      SJ_MATERIALIZATION_INFO *sjm= s->emb_sj_nest->sj_mat_info;
      sjm->is_used= TRUE;
      sjm->is_sj_scan= FALSE;
      memcpy((uchar*) (pos - sjm->tables + 1), (uchar*) sjm->positions,
             sizeof(POSITION) * sjm->tables);
      recalculate_prefix_record_count(join, tablenr - sjm->tables + 1,
                                      tablenr);
      first= tablenr - sjm->tables + 1;
      join->best_positions[first].n_sj_tables= sjm->tables;
      join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE;
      Json_writer_object semijoin_strategy(thd);
      semijoin_strategy.add("semi_join_strategy","SJ-Materialization");
      Json_writer_array semijoin_plan(thd, "join_order");
      for (uint i= first; i < first+ sjm->tables; i++)
      {
        if (unlikely(thd->trace_started()))
        {
          Json_writer_object trace_one_table(thd);
          trace_one_table.add_table_name(join->best_positions[i].table);
        }
        join->sjm_lookup_tables |= join->best_positions[i].table->table->map;
      }
    }
    else if (pos->sj_strategy == SJ_OPT_MATERIALIZE_SCAN)
    {
      POSITION *first_inner= join->best_positions + pos->sjmat_picker.sjm_scan_last_inner;
      SJ_MATERIALIZATION_INFO *sjm= first_inner->table->emb_sj_nest->sj_mat_info;
      sjm->is_used= TRUE;
      sjm->is_sj_scan= TRUE;
      first= pos->sjmat_picker.sjm_scan_last_inner - sjm->tables + 1;
      memcpy((uchar*) (join->best_positions + first),
             (uchar*) sjm->positions, sizeof(POSITION) * sjm->tables);
      recalculate_prefix_record_count(join, first, first + sjm->tables);
      join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE_SCAN;
      join->best_positions[first].n_sj_tables= sjm->tables;
      /*
        Do what advance_sj_state did: re-run best_access_path for every table
        in the [last_inner_table + 1; pos..) range
      */
      double prefix_rec_count;
      /* Get the prefix record count */
      if (first == join->const_tables)
        prefix_rec_count= 1.0;
      else
        prefix_rec_count= join->best_positions[first-1].prefix_record_count;

      /* Add materialization record count */
      prefix_rec_count *= sjm->rows;

      uint i;
      table_map rem_tables= remaining_tables;
      for (i= tablenr; i != (first + sjm->tables - 1); i--)
        rem_tables |= join->best_positions[i].table->table->map;

      for (i= first; i < first+ sjm->tables; i++)
        join->sjm_scan_tables |= join->best_positions[i].table->table->map;

      POSITION dummy;
      join->cur_sj_inner_tables= 0;
      Json_writer_object semijoin_strategy(thd);
      semijoin_strategy.add("semi_join_strategy","SJ-Materialization-Scan");
      Json_writer_array semijoin_plan(thd, "join_order");
      for (i= first + sjm->tables; i <= tablenr; i++)
      {
        if (unlikely(thd->trace_started()))
        {
          Json_writer_object trace_one_table(thd);
          trace_one_table.add_table_name(join->best_positions[i].table);
        }
        best_access_path(join, join->best_positions[i].table, rem_tables, i,
                         FALSE, prefix_rec_count,
                         join->best_positions + i, &dummy);
        prefix_rec_count *= join->best_positions[i].records_read;
        rem_tables &= ~join->best_positions[i].table->table->map;
  3276. }
  3277. }
  3278. if (pos->sj_strategy == SJ_OPT_FIRST_MATCH)
  3279. {
  3280. first= pos->firstmatch_picker.first_firstmatch_table;
  3281. join->best_positions[first].sj_strategy= SJ_OPT_FIRST_MATCH;
  3282. join->best_positions[first].n_sj_tables= tablenr - first + 1;
  3283. POSITION dummy; // For loose scan paths
  3284. double record_count= (first== join->const_tables)? 1.0:
  3285. join->best_positions[tablenr - 1].prefix_record_count;
  3286. table_map rem_tables= remaining_tables;
  3287. uint idx;
  3288. for (idx= first; idx <= tablenr; idx++)
  3289. {
  3290. rem_tables |= join->best_positions[idx].table->table->map;
  3291. }
  3292. /*
  3293. Re-run best_access_path to produce best access methods that do not use
  3294. join buffering
  3295. */
  3296. join->cur_sj_inner_tables= 0;
  3297. Json_writer_object semijoin_strategy(thd);
  3298. semijoin_strategy.add("semi_join_strategy","FirstMatch");
  3299. Json_writer_array semijoin_plan(thd, "join_order");
  3300. for (idx= first; idx <= tablenr; idx++)
  3301. {
  3302. if (unlikely(thd->trace_started()))
  3303. {
  3304. Json_writer_object trace_one_table(thd);
  3305. trace_one_table.add_table_name(join->best_positions[idx].table);
  3306. }
  3307. if (join->best_positions[idx].use_join_buffer)
  3308. {
  3309. best_access_path(join, join->best_positions[idx].table,
  3310. rem_tables, idx, TRUE /* no jbuf */,
  3311. record_count, join->best_positions + idx, &dummy);
  3312. }
  3313. record_count *= join->best_positions[idx].records_read;
  3314. rem_tables &= ~join->best_positions[idx].table->table->map;
  3315. }
  3316. }
  3317. if (pos->sj_strategy == SJ_OPT_LOOSE_SCAN)
  3318. {
  3319. first= pos->loosescan_picker.first_loosescan_table;
  3320. POSITION *first_pos= join->best_positions + first;
  3321. POSITION loose_scan_pos; // For loose scan paths
  3322. double record_count= (first== join->const_tables)? 1.0:
  3323. join->best_positions[tablenr - 1].prefix_record_count;
  3324. table_map rem_tables= remaining_tables;
  3325. uint idx;
  3326. for (idx= first; idx <= tablenr; idx++)
  3327. rem_tables |= join->best_positions[idx].table->table->map;
  3328. /*
  3329. Re-run best_access_path to produce best access methods that do not use
  3330. join buffering
  3331. */
  3332. join->cur_sj_inner_tables= 0;
  3333. Json_writer_object semijoin_strategy(thd);
  3334. semijoin_strategy.add("semi_join_strategy","LooseScan");
  3335. Json_writer_array semijoin_plan(thd, "join_order");
  3336. for (idx= first; idx <= tablenr; idx++)
  3337. {
  3338. if (unlikely(thd->trace_started()))
  3339. {
  3340. Json_writer_object trace_one_table(thd);
  3341. trace_one_table.add_table_name(join->best_positions[idx].table);
  3342. }
  3343. if (join->best_positions[idx].use_join_buffer || (idx == first))
  3344. {
  3345. best_access_path(join, join->best_positions[idx].table,
  3346. rem_tables, idx, TRUE /* no jbuf */,
  3347. record_count, join->best_positions + idx,
  3348. &loose_scan_pos);
  3349. if (idx==first)
  3350. {
  3351. join->best_positions[idx]= loose_scan_pos;
  3352. /*
  3353. If LooseScan is based on ref access (including the "degenerate"
  3354. one with 0 key parts), we should use full index scan.
  3355. Unfortunately, lots of code assumes that if tab->type==JT_ALL &&
  3356. tab->quick!=NULL, then quick select should be used. The only
  3357. simple way to fix this is to remove the quick select:
  3358. */
  3359. if (join->best_positions[idx].key)
  3360. {
  3361. delete join->best_positions[idx].table->quick;
  3362. join->best_positions[idx].table->quick= NULL;
  3363. }
  3364. }
  3365. }
  3366. rem_tables &= ~join->best_positions[idx].table->table->map;
  3367. record_count *= join->best_positions[idx].records_read;
  3368. }
  3369. first_pos->sj_strategy= SJ_OPT_LOOSE_SCAN;
  3370. first_pos->n_sj_tables= my_count_bits(first_pos->table->emb_sj_nest->sj_inner_tables);
  3371. }
  3372. if (pos->sj_strategy == SJ_OPT_DUPS_WEEDOUT)
  3373. {
  3374. /*
  3375. Duplicate Weedout starting at pos->first_dupsweedout_table, ending at
  3376. this table.
  3377. */
  3378. first= pos->dups_weedout_picker.first_dupsweedout_table;
  3379. join->best_positions[first].sj_strategy= SJ_OPT_DUPS_WEEDOUT;
  3380. join->best_positions[first].n_sj_tables= tablenr - first + 1;
  3381. }
  3382. uint i_end= first + join->best_positions[first].n_sj_tables;
  3383. for (uint i= first; i < i_end; i++)
  3384. {
  3385. if (i != first)
  3386. join->best_positions[i].sj_strategy= SJ_OPT_NONE;
  3387. handled_tabs |= join->best_positions[i].table->table->map;
  3388. }
  3389. if (tablenr != first)
  3390. pos->sj_strategy= SJ_OPT_NONE;
  3391. remaining_tables |= s->table->map;
  3392. join->join_tab[first].sj_strategy= join->best_positions[first].sj_strategy;
  3393. join->join_tab[first].n_sj_tables= join->best_positions[first].n_sj_tables;
  3394. }
  3395. }
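/*
  Illustration only, not part of the server code. The function above walks
  the chosen join order from the last table to the first, moves each
  semi-join strategy marker from the last table of its range to the first
  one, and skips tables already covered by a range. A minimal sketch of that
  backward pass follows. Pos, Strategy and fix_strategies are hypothetical
  simplified stand-ins for POSITION, the SJ_OPT_* constants and the real
  function; costs and access-path recalculation are left out.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Strategy { NONE, FIRST_MATCH, DUPS_WEEDOUT };

// Hypothetical, simplified stand-in for POSITION.
struct Pos
{
  Strategy strategy;     // as left by the optimizer: set at the LAST table of a range
  std::size_t first;     // index of the first table of that range
  std::size_t n_tables;  // filled in by the fix-up pass
};

// Walk the join order backwards. For every range [first, nr] record the
// strategy and the range length at 'first', clear the marker on the other
// tables of the range, and remember that the range has been handled.
void fix_strategies(std::vector<Pos> &order)
{
  std::vector<bool> handled(order.size(), false);
  for (std::size_t nr= order.size(); nr-- > 0; )
  {
    if (handled[nr] || order[nr].strategy == NONE)
      continue;
    const Strategy chosen= order[nr].strategy;
    const std::size_t first= order[nr].first;
    for (std::size_t i= first; i <= nr; i++)
    {
      order[i].strategy= NONE;
      handled[i]= true;
    }
    order[first].strategy= chosen;
    order[first].n_tables= nr - first + 1;
  }
}
```

/*
  In the sketch, a stale marker left on a table inside an already handled
  range is ignored, which is the role handled_tabs plays in the code above.
*/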
  3396. /*
  3397. Setup semi-join materialization strategy for one semi-join nest
  3398. SYNOPSIS
  3399. setup_sj_materialization()
  3400. tab The first tab in the semi-join
  3401. DESCRIPTION
  3402. Setup execution structures for one semi-join materialization nest:
  3403. - Create the materialization temporary table
  3404. - If we're going to do index lookups
  3405. create TABLE_REF structure to make the lookups
  3406. - else (if we're going to do a full scan of the temptable)
  3407. create Copy_field structures to do copying.
  3408. RETURN
  3409. FALSE Ok
  3410. TRUE Error
  3411. */
  3412. bool setup_sj_materialization_part1(JOIN_TAB *sjm_tab)
  3413. {
  3414. JOIN_TAB *tab= sjm_tab->bush_children->start;
  3415. TABLE_LIST *emb_sj_nest= tab->table->pos_in_table_list->embedding;
  3416. SJ_MATERIALIZATION_INFO *sjm;
  3417. THD *thd;
  3418. DBUG_ENTER("setup_sj_materialization_part1");
  3419. /* Walk out of outer join nests until we reach the semi-join nest we're in */
  3420. while (!emb_sj_nest->sj_mat_info)
  3421. emb_sj_nest= emb_sj_nest->embedding;
  3422. sjm= emb_sj_nest->sj_mat_info;
  3423. thd= tab->join->thd;
  3424. /* First the calls come to the materialization function */
  3425. DBUG_ASSERT(sjm->is_used);
  3426. /*
  3427. Set up the table to write to, do as select_union::create_result_table does
  3428. */
  3429. sjm->sjm_table_param.init();
  3430. sjm->sjm_table_param.bit_fields_as_long= TRUE;
  3431. SELECT_LEX *subq_select= emb_sj_nest->sj_subq_pred->unit->first_select();
  3432. const LEX_CSTRING sj_materialize_name= { STRING_WITH_LEN("sj-materialize") };
  3433. List_iterator<Item> it(subq_select->item_list);
  3434. Item *item;
  3435. while((item= it++))
  3436. {
  3437. /*
  3438. This semi-join replaced the subquery (subq_select) and so on
  3439. re-executing it will not be prepared. To use the Items from its
  3440. select list we have to prepare (fix_fields) them
  3441. */
  3442. if (item->fix_fields_if_needed(thd, it.ref()))
  3443. DBUG_RETURN(TRUE);
  3444. item= *(it.ref()); // it can be changed by fix_fields
  3445. DBUG_ASSERT(!item->name.length || item->name.length == strlen(item->name.str));
  3446. sjm->sjm_table_cols.push_back(item, thd->mem_root);
  3447. }
  3448. sjm->sjm_table_param.field_count= subq_select->item_list.elements;
  3449. sjm->sjm_table_param.force_not_null_cols= TRUE;
  3450. if (!(sjm->table= create_tmp_table(thd, &sjm->sjm_table_param,
  3451. sjm->sjm_table_cols, (ORDER*) 0,
  3452. TRUE /* distinct */,
  3453. 1, /*save_sum_fields*/
  3454. thd->variables.option_bits | TMP_TABLE_ALL_COLUMNS,
  3455. HA_POS_ERROR /*rows_limit */,
  3456. &sj_materialize_name)))
  3457. DBUG_RETURN(TRUE); /* purecov: inspected */
  3458. sjm->table->map= emb_sj_nest->nested_join->used_tables;
  3459. sjm->table->file->extra(HA_EXTRA_WRITE_CACHE);
  3460. sjm->table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  3461. tab->join->sj_tmp_tables.push_back(sjm->table, thd->mem_root);
  3462. tab->join->sjm_info_list.push_back(sjm, thd->mem_root);
  3463. sjm->materialized= FALSE;
  3464. sjm_tab->table= sjm->table;
  3465. sjm->table->pos_in_table_list= emb_sj_nest;
  3466. DBUG_RETURN(FALSE);
  3467. }
  3468. /**
  3469. @retval
  3470. FALSE ok
  3471. TRUE error
  3472. */
  3473. bool setup_sj_materialization_part2(JOIN_TAB *sjm_tab)
  3474. {
  3475. DBUG_ENTER("setup_sj_materialization_part2");
  3476. JOIN_TAB *tab= sjm_tab->bush_children->start;
  3477. TABLE_LIST *emb_sj_nest= tab->table->pos_in_table_list->embedding;
  3478. /* Walk out of outer join nests until we reach the semi-join nest we're in */
  3479. while (!emb_sj_nest->sj_mat_info)
  3480. emb_sj_nest= emb_sj_nest->embedding;
  3481. SJ_MATERIALIZATION_INFO *sjm= emb_sj_nest->sj_mat_info;
  3482. THD *thd= tab->join->thd;
  3483. uint i;
  3484. if (!sjm->is_sj_scan)
  3485. {
  3486. KEY *tmp_key; /* The only index on the temporary table. */
  3487. uint tmp_key_parts; /* Number of keyparts in tmp_key. */
  3488. tmp_key= sjm->table->key_info;
  3489. tmp_key_parts= tmp_key->user_defined_key_parts;
  3490. /*
  3491. Create/initialize everything we will need to do index lookups into the
  3492. temptable.
  3493. */
  3494. TABLE_REF *tab_ref;
  3495. tab_ref= &sjm_tab->ref;
  3496. tab_ref->key= 0; /* The only temp table index. */
  3497. tab_ref->key_length= tmp_key->key_length;
  3498. if (!(tab_ref->key_buff=
  3499. (uchar*) thd->calloc(ALIGN_SIZE(tmp_key->key_length) * 2)) ||
  3500. !(tab_ref->key_copy=
  3501. (store_key**) thd->alloc((sizeof(store_key*) *
  3502. (tmp_key_parts + 1)))) ||
  3503. !(tab_ref->items=
  3504. (Item**) thd->alloc(sizeof(Item*) * tmp_key_parts)))
  3505. DBUG_RETURN(TRUE); /* purecov: inspected */
  3506. tab_ref->key_buff2=tab_ref->key_buff+ALIGN_SIZE(tmp_key->key_length);
  3507. tab_ref->key_err=1;
  3508. tab_ref->null_rejecting= 1;
  3509. tab_ref->disable_cache= FALSE;
  3510. KEY_PART_INFO *cur_key_part= tmp_key->key_part;
  3511. store_key **ref_key= tab_ref->key_copy;
  3512. uchar *cur_ref_buff= tab_ref->key_buff;
  3513. for (i= 0; i < tmp_key_parts; i++, cur_key_part++, ref_key++)
  3514. {
  3515. tab_ref->items[i]= emb_sj_nest->sj_subq_pred->left_expr->element_index(i);
  3516. int null_count= MY_TEST(cur_key_part->field->real_maybe_null());
  3517. *ref_key= new store_key_item(thd, cur_key_part->field,
  3518. /* TODO:
  3519. the NULL byte is taken into account in
  3520. cur_key_part->store_length, so instead of
  3521. cur_ref_buff + MY_TEST(maybe_null), we could
  3522. use that information instead.
  3523. */
  3524. cur_ref_buff + null_count,
  3525. null_count ? cur_ref_buff : 0,
  3526. cur_key_part->length, tab_ref->items[i],
  3527. FALSE);
  3528. if (!*ref_key)
  3529. DBUG_RETURN(TRUE);
  3530. cur_ref_buff+= cur_key_part->store_length;
  3531. }
  3532. *ref_key= NULL; /* End marker. */
  3533. /*
  3534. We don't ever have guarded conditions for SJM tables, but code at SQL
  3535. layer depends on cond_guards array being alloced.
  3536. */
  3537. if (!(tab_ref->cond_guards= (bool**) thd->calloc(sizeof(bool*)*tmp_key_parts)))
  3538. {
  3539. DBUG_RETURN(TRUE);
  3540. }
  3541. tab_ref->key_err= 1;
  3542. tab_ref->key_parts= tmp_key_parts;
  3543. sjm->tab_ref= tab_ref;
  3544. /*
  3545. Remove the injected semi-join IN-equalities from join_tab conds. This
  3546. needs to be done because the IN-equalities refer to columns of
  3547. sj-inner tables which are not available after the materialization
  3548. has been finished.
  3549. */
  3550. for (i= 0; i < sjm->tables; i++)
  3551. {
  3552. if (remove_sj_conds(thd, &tab[i].select_cond) ||
  3553. (tab[i].select && remove_sj_conds(thd, &tab[i].select->cond)))
  3554. DBUG_RETURN(TRUE);
  3555. }
  3556. if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
  3557. emb_sj_nest->sj_subq_pred)))
  3558. DBUG_RETURN(TRUE); /* purecov: inspected */
  3559. sjm_tab->type= JT_EQ_REF;
  3560. sjm_tab->select_cond= sjm->in_equality;
  3561. }
  3562. else
  3563. {
  3564. /*
  3565. We'll be doing full scan of the temptable.
  3566. Setup copying of temptable columns back to the record buffers
  3567. for their source tables. We need this because IN-equalities
  3568. refer to the original tables.
  3569. EXAMPLE
  3570. Consider the query:
  3571. SELECT * FROM ot WHERE ot.col1 IN (SELECT it.col2 FROM it)
  3572. Suppose it's executed with SJ-Materialization-scan. We choose to do scan
  3573. if we can't do the lookup, i.e. the join order is (it, ot). The plan
  3574. would look as follows:
  3575. table    access method       condition
  3576. it       materialize+scan    -
  3577. ot       (whatever)          ot.col1=it.col2 (C2)
  3578. The condition C2 refers to current row of table it. The problem is
  3579. that by the time we evaluate C2, we would have finished with scanning
  3580. it itself and will be scanning the temptable.
  3581. At the moment, our solution is to copy back: when we get the next
  3582. temptable record, we copy its columns to their corresponding columns
  3583. in the record buffers for the source tables.
  3584. */
  3585. if (!(sjm->copy_field= new Copy_field[sjm->sjm_table_cols.elements]))
  3586. DBUG_RETURN(TRUE);
  3587. //it.rewind();
  3588. Ref_ptr_array p_items= emb_sj_nest->sj_subq_pred->unit->first_select()->ref_pointer_array;
  3589. for (uint i=0; i < sjm->sjm_table_cols.elements; i++)
  3590. {
  3591. bool dummy;
  3592. Item_equal *item_eq;
  3593. //Item *item= (it++)->real_item();
  3594. Item *item= p_items[i]->real_item();
  3595. DBUG_ASSERT(item->type() == Item::FIELD_ITEM);
  3596. Field *copy_to= ((Item_field*)item)->field;
  3597. /*
  3598. Tricks with Item_equal are due to the following: suppose we have a
  3599. query:
  3600. ... WHERE cond(ot.col) AND ot.col IN (SELECT it2.col FROM it1,it2
  3601. WHERE it1.col= it2.col)
  3602. then equality propagation will create an
  3603. Item_equal(it1.col, it2.col, ot.col)
  3604. then substitute_for_best_equal_field() will change the conditions
  3605. according to the join order:
  3606. table | attached condition
  3607. ------+--------------------
  3608. it1 |
  3609. it2 | it1.col=it2.col
  3610. ot | cond(it1.col)
  3611. although we've originally had "SELECT it2.col", conditions attached
  3612. to subsequent outer tables will refer to it1.col, so SJM-Scan will
  3613. need to unpack data to there.
  3614. That is, if an element from subquery's select list participates in
  3615. equality propagation, then we need to unpack it to the first
  3616. equality propagation member that refers to a table that is
  3617. within the subquery.
  3618. */
  3619. item_eq= find_item_equal(tab->join->cond_equal, copy_to, &dummy);
  3620. if (item_eq)
  3621. {
  3622. List_iterator<Item> it(item_eq->equal_items);
  3623. /* We're interested in field items only */
  3624. if (item_eq->get_const())
  3625. it++;
  3626. Item *item;
  3627. while ((item= it++))
  3628. {
  3629. if (!(item->used_tables() & ~emb_sj_nest->sj_inner_tables))
  3630. {
  3631. DBUG_ASSERT(item->real_item()->type() == Item::FIELD_ITEM);
  3632. copy_to= ((Item_field *) (item->real_item()))->field;
  3633. break;
  3634. }
  3635. }
  3636. }
  3637. sjm->copy_field[i].set(copy_to, sjm->table->field[i], FALSE);
  3638. /* The write_set for source tables must be set up to allow the copying */
  3639. bitmap_set_bit(copy_to->table->write_set, copy_to->field_index);
  3640. }
  3641. sjm_tab->type= JT_ALL;
  3642. /* Initialize full scan */
  3643. sjm_tab->read_first_record= join_read_record_no_init;
  3644. sjm_tab->read_record.copy_field= sjm->copy_field;
  3645. sjm_tab->read_record.copy_field_end= sjm->copy_field +
  3646. sjm->sjm_table_cols.elements;
  3647. sjm_tab->read_record.read_record_func= rr_sequential_and_unpack;
  3648. }
  3649. sjm_tab->bush_children->end[-1].next_select= end_sj_materialize;
  3650. DBUG_RETURN(FALSE);
  3651. }
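/*
  Illustration only, not part of the server code. The "copy back" idea used
  for SJ-Materialization-Scan can be sketched as follows: every time the
  next temptable row is read, its columns are copied into the record
  buffers of the source tables, so that conditions written against the
  source columns (like C2 in the example above) see the current row.
  ScanState, CopyField and read_next_and_unpack are hypothetical stand-ins
  (loosely modelled on Copy_field and rr_sequential_and_unpack); columns
  are plain ints for brevity.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Copy_field: temptable column -> source buffer.
struct CopyField
{
  std::size_t from_col;  // column number in the temptable row
  int *to;               // where the source table keeps that column
};

struct ScanState
{
  std::vector<std::vector<int>> rows;  // materialized (de-duplicated) rows
  std::vector<CopyField> copy;         // one entry per temptable column
  std::size_t next= 0;                 // scan position
};

// Read the next temptable row and unpack it into the source buffers.
// Returns false when the scan is finished.
bool read_next_and_unpack(ScanState &st)
{
  if (st.next >= st.rows.size())
    return false;
  const std::vector<int> &row= st.rows[st.next++];
  for (const CopyField &c : st.copy)
    *c.to= row[c.from_col];
  return true;
}
```

/*
  After each successful call, a condition such as ot.col1 = it.col2 can be
  evaluated by reading the it.col2 buffer, even though table it is no
  longer being scanned.
*/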
  3652. /*
  3653. Create subquery IN-equalities assuming use of materialization strategy
  3654. SYNOPSIS
  3655. create_subq_in_equalities()
  3656. thd Thread handle
  3657. sjm Semi-join materialization structure
  3658. subq_pred The subquery predicate
  3659. DESCRIPTION
  3660. Create subquery IN-equality predicates. That is, for a subquery
  3661. (oe1, oe2, ...) IN (SELECT ie1, ie2, ... FROM ...)
  3662. create "oe1=ie1 AND oe2=ie2 AND ..." expression, such that ie1, ie2, ..
  3663. refer to the columns of the table that's used to materialize the
  3664. subquery.
  3665. RETURN
  3666. Created condition
  3667. */
  3668. static Item *create_subq_in_equalities(THD *thd, SJ_MATERIALIZATION_INFO *sjm,
  3669. Item_in_subselect *subq_pred)
  3670. {
  3671. Item *res= NULL;
  3672. if (subq_pred->left_expr->cols() == 1)
  3673. {
  3674. if (!(res= new (thd->mem_root) Item_func_eq(thd, subq_pred->left_expr,
  3675. new (thd->mem_root) Item_field(thd, sjm->table->field[0]))))
  3676. return NULL; /* purecov: inspected */
  3677. }
  3678. else
  3679. {
  3680. Item *conj;
  3681. for (uint i= 0; i < subq_pred->left_expr->cols(); i++)
  3682. {
  3683. if (!(conj= new (thd->mem_root) Item_func_eq(thd, subq_pred->left_expr->element_index(i),
  3684. new (thd->mem_root) Item_field(thd, sjm->table->field[i]))) ||
  3685. !(res= and_items(thd, res, conj)))
  3686. return NULL; /* purecov: inspected */
  3687. }
  3688. }
  3689. if (res->fix_fields(thd, &res))
  3690. return NULL; /* purecov: inspected */
  3691. return res;
  3692. }
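/*
  Illustration only, not part of the server code. The shape of the condition
  built above, one equality per left-hand expression paired with the
  temptable column at the same position, can be sketched with plain strings.
  build_in_equalities and the "colN" column naming are hypothetical; the
  real code builds Item_func_eq objects and joins them with and_items().
*/

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pair the i-th outer expression with the i-th temptable column and join
// the resulting equalities with AND.
std::string build_in_equalities(const std::vector<std::string> &outer_exprs,
                                const std::string &tmp_table)
{
  std::string res;
  for (std::size_t i= 0; i < outer_exprs.size(); i++)
  {
    if (i)
      res+= " AND ";
    res+= outer_exprs[i] + "=" + tmp_table + ".col" + std::to_string(i);
  }
  return res;
}
```

/*
  For (oe1, oe2) and a temptable called sjm, the sketch yields
  "oe1=sjm.col0 AND oe2=sjm.col1".
*/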
  3693. /**
  3694. @retval
  3695. 0 ok
  3696. 1 error
  3697. */
  3698. static bool remove_sj_conds(THD *thd, Item **tree)
  3699. {
  3700. if (*tree)
  3701. {
  3702. if (is_cond_sj_in_equality(*tree))
  3703. {
  3704. *tree= NULL;
  3705. return 0;
  3706. }
  3707. else if ((*tree)->type() == Item::COND_ITEM)
  3708. {
  3709. Item *item;
  3710. List_iterator<Item> li(*(((Item_cond*)*tree)->argument_list()));
  3711. while ((item= li++))
  3712. {
  3713. if (is_cond_sj_in_equality(item))
  3714. {
  3715. Item_int *tmp= new (thd->mem_root) Item_int(thd, 1);
  3716. if (!tmp)
  3717. return 1;
  3718. li.replace(tmp);
  3719. }
  3720. }
  3721. }
  3722. }
  3723. return 0;
  3724. }
  3725. /* Check if given Item was injected by semi-join equality */
  3726. static bool is_cond_sj_in_equality(Item *item)
  3727. {
  3728. if (item->type() == Item::FUNC_ITEM &&
  3729. ((Item_func*)item)->functype()== Item_func::EQ_FUNC)
  3730. {
  3731. Item_func_eq *item_eq= (Item_func_eq*)item;
  3732. return MY_TEST(item_eq->in_equality_no != UINT_MAX);
  3733. }
  3734. return FALSE;
  3735. }
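/*
  Illustration only, not part of the server code. remove_sj_conds() above
  does not erase injected IN-equalities from an AND list; it overwrites each
  of them with a constant true item. A minimal sketch of that replacement on
  a flat list follows. Pred and remove_injected are hypothetical; the real
  code recognizes injected equalities through in_equality_no and replaces
  them with Item_int(1).
*/

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one argument of an AND condition.
struct Pred
{
  std::string text;  // printable form of the predicate
  bool injected;     // true if it is an injected semi-join IN-equality
};

// Replace every injected equality with the constant "1" (always true).
// The list keeps its length, so the other arguments stay where they are.
void remove_injected(std::vector<Pred> &and_args)
{
  for (Pred &p : and_args)
  {
    if (p.injected)
      p= Pred{"1", false};
  }
}
```

/*
  Like the function above, the sketch only looks at the direct arguments of
  the condition and does not descend into nested conditions.
*/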
  3736. /*
  3737. Create a temporary table to weed out duplicate rowid combinations
  3738. SYNOPSIS
  3739. create_sj_weedout_tmp_table()
  3740. thd Thread handle
  3741. DESCRIPTION
  3742. Create a temporary table to weed out duplicate rowid combinations. The
  3743. table has a single column that is a concatenation of all rowids in the
  3744. combination.
  3745. Depending on the needed length, there are two cases:
  3746. 1. When the length of the column < max_key_length:
  3747. CREATE TABLE tmp (col VARBINARY(n) NOT NULL, UNIQUE KEY(col));
  3748. 2. Otherwise (not a valid SQL syntax but internally supported):
  3749. CREATE TABLE tmp (col VARBINARY NOT NULL, UNIQUE CONSTRAINT(col));
  3750. The code in this function was produced by extraction of relevant parts
  3751. from create_tmp_table().
  3752. RETURN
  3753. FALSE OK, the created table is stored in this->tmp_table
  3754. TRUE Error
  3755. */
  3756. bool
  3757. SJ_TMP_TABLE::create_sj_weedout_tmp_table(THD *thd)
  3758. {
  3759. MEM_ROOT *mem_root_save, own_root;
  3760. TABLE *table;
  3761. TABLE_SHARE *share;
  3762. uint temp_pool_slot=MY_BIT_NONE;
  3763. char *tmpname,path[FN_REFLEN];
  3764. Field **reg_field;
  3765. KEY_PART_INFO *key_part_info;
  3766. KEY *keyinfo;
  3767. uchar *group_buff;
  3768. uchar *bitmaps;
  3769. uint *blob_field;
  3770. bool using_unique_constraint=FALSE;
  3771. bool use_packed_rows= FALSE;
  3772. Field *field, *key_field;
  3773. uint null_pack_length, null_count;
  3774. uchar *null_flags;
  3775. uchar *pos;
  3776. DBUG_ENTER("create_sj_weedout_tmp_table");
  3777. DBUG_ASSERT(!is_degenerate);
  3778. tmp_table= NULL;
  3779. uint uniq_tuple_length_arg= rowid_len + null_bytes;
  3780. /*
  3781. STEP 1: Get temporary table name
  3782. */
  3783. if (use_temp_pool && !(test_flags & TEST_KEEP_TMP_TABLES))
  3784. temp_pool_slot = bitmap_lock_set_next(&temp_pool);
  3785. if (temp_pool_slot != MY_BIT_NONE) // we got a slot
  3786. sprintf(path, "%s_%lx_%i", tmp_file_prefix,
  3787. current_pid, temp_pool_slot);
  3788. else
  3789. {
  3790. /* if we run out of slots or we are not using tempool */
  3791. sprintf(path,"%s%lx_%lx_%x", tmp_file_prefix,current_pid,
  3792. (ulong) thd->thread_id, thd->tmp_table++);
  3793. }
  3794. fn_format(path, path, mysql_tmpdir, "", MY_REPLACE_EXT|MY_UNPACK_FILENAME);
  3795. /* STEP 2: Figure if we'll be using a key or blob+constraint */
  3796. /* it always has my_charset_bin, so mbmaxlen==1 */
  3797. if (uniq_tuple_length_arg >= CONVERT_IF_BIGGER_TO_BLOB)
  3798. using_unique_constraint= TRUE;
  3799. /* STEP 3: Allocate memory for temptable description */
  3800. init_sql_alloc(&own_root, "SJ_TMP_TABLE",
  3801. TABLE_ALLOC_BLOCK_SIZE, 0, MYF(MY_THREAD_SPECIFIC));
  3802. if (!multi_alloc_root(&own_root,
  3803. &table, sizeof(*table),
  3804. &share, sizeof(*share),
  3805. &reg_field, sizeof(Field*) * (1+1),
  3806. &blob_field, sizeof(uint)*2,
  3807. &keyinfo, sizeof(*keyinfo),
  3808. &key_part_info, sizeof(*key_part_info) * 2,
  3809. &start_recinfo,
  3810. sizeof(*recinfo)*(1*2+4),
  3811. &tmpname, (uint) strlen(path)+1,
  3812. &group_buff, (!using_unique_constraint ?
  3813. uniq_tuple_length_arg : 0),
  3814. &bitmaps, bitmap_buffer_size(1)*6,
  3815. NullS))
  3816. {
  3817. if (temp_pool_slot != MY_BIT_NONE)
  3818. bitmap_lock_clear_bit(&temp_pool, temp_pool_slot);
  3819. DBUG_RETURN(TRUE);
  3820. }
  3821. strmov(tmpname,path);
  3822. /* STEP 4: Create TABLE description */
  3823. bzero((char*) table,sizeof(*table));
  3824. bzero((char*) reg_field,sizeof(Field*)*2);
  3825. table->mem_root= own_root;
  3826. mem_root_save= thd->mem_root;
  3827. thd->mem_root= &table->mem_root;
  3828. table->field=reg_field;
  3829. table->alias.set("weedout-tmp", sizeof("weedout-tmp")-1,
  3830. table_alias_charset);
  3831. table->reginfo.lock_type=TL_WRITE; /* Will be updated */
  3832. table->db_stat=HA_OPEN_KEYFILE;
  3833. table->map=1;
  3834. table->temp_pool_slot = temp_pool_slot;
  3835. table->copy_blobs= 1;
  3836. table->in_use= thd;
  3837. table->s= share;
  3838. init_tmp_table_share(thd, share, "", 0, tmpname, tmpname);
  3839. share->blob_field= blob_field;
  3840. share->table_charset= NULL;
  3841. share->primary_key= MAX_KEY; // Indicate no primary key
  3842. /* Create the field */
  3843. {
  3844. LEX_CSTRING field_name= {STRING_WITH_LEN("rowids") };
  3845. /*
  3846. For the sake of uniformity, always use Field_varstring (although we could
  3847. use Field_string for shorter keys)
  3848. */
  3849. field= new Field_varstring(uniq_tuple_length_arg, FALSE, &field_name,
  3850. share, &my_charset_bin);
  3851. if (!field)
  3852. { thd->mem_root= mem_root_save; DBUG_RETURN(TRUE); }
  3853. field->table= table;
  3854. field->key_start.clear_all();
  3855. field->part_of_key.clear_all();
  3856. field->part_of_sortkey.clear_all();
  3857. field->unireg_check= Field::NONE;
  3858. field->flags= (NOT_NULL_FLAG | BINARY_FLAG | NO_DEFAULT_VALUE_FLAG);
  3859. field->reset_fields();
  3860. field->init(table);
  3861. field->orig_table= NULL;
  3862. field->field_index= 0;
  3863. *(reg_field++)= field;
  3864. *blob_field= 0;
  3865. *reg_field= 0;
  3866. share->fields= 1;
  3867. share->blob_fields= 0;
  3868. }
  3869. uint reclength= field->pack_length();
  3870. if (using_unique_constraint)
  3871. {
  3872. share->db_plugin= ha_lock_engine(0, TMP_ENGINE_HTON);
  3873. table->file= get_new_handler(share, &table->mem_root,
  3874. share->db_type());
  3875. }
  3876. else
  3877. {
  3878. share->db_plugin= ha_lock_engine(0, heap_hton);
  3879. table->file= get_new_handler(share, &table->mem_root,
  3880. share->db_type());
  3881. DBUG_ASSERT(!table->file || uniq_tuple_length_arg <= table->file->max_key_length());
  3882. }
  3883. if (!table->file)
  3884. goto err;
  3885. if (table->file->set_ha_share_ref(&share->ha_share))
  3886. {
  3887. delete table->file;
  3888. goto err;
  3889. }
  3890. null_count=1;
  3891. null_pack_length= 1;
  3892. reclength += null_pack_length;
  3893. share->reclength= reclength;
  3894. {
  3895. uint alloc_length=ALIGN_SIZE(share->reclength + MI_UNIQUE_HASH_LENGTH+1);
  3896. share->rec_buff_length= alloc_length;
  3897. if (!(table->record[0]= (uchar*)
  3898. alloc_root(&table->mem_root, alloc_length*3)))
  3899. goto err;
  3900. table->record[1]= table->record[0]+alloc_length;
  3901. share->default_values= table->record[1]+alloc_length;
  3902. }
  3903. setup_tmp_table_column_bitmaps(table, bitmaps);
  3904. recinfo= start_recinfo;
  3905. null_flags=(uchar*) table->record[0];
  3906. pos=table->record[0]+ null_pack_length;
  3907. if (null_pack_length)
  3908. {
  3909. bzero((uchar*) recinfo,sizeof(*recinfo));
  3910. recinfo->type=FIELD_NORMAL;
  3911. recinfo->length=null_pack_length;
  3912. recinfo++;
  3913. bfill(null_flags,null_pack_length,255); // Set null fields
  3914. table->null_flags= (uchar*) table->record[0];
  3915. share->null_fields= null_count;
  3916. share->null_bytes= null_pack_length;
  3917. }
  3918. null_count=1;
  3919. {
  3920. //Field *field= *reg_field;
  3921. uint length;
  3922. bzero((uchar*) recinfo,sizeof(*recinfo));
  3923. field->move_field(pos,(uchar*) 0,0);
  3924. field->reset();
  3925. /*
  3926. Test if there is a default field value. The test for ->ptr is to skip
  3927. 'offset' fields generated by initialize_tables
  3928. */
  3929. // Initialize the table field:
  3930. bzero(field->ptr, field->pack_length());
  3931. length=field->pack_length();
  3932. pos+= length;
  3933. /* Make entry for create table */
  3934. recinfo->length=length;
  3935. if (field->flags & BLOB_FLAG)
  3936. recinfo->type= FIELD_BLOB;
  3937. else if (use_packed_rows &&
  3938. field->real_type() == MYSQL_TYPE_STRING &&
  3939. length >= MIN_STRING_LENGTH_TO_PACK_ROWS)
  3940. recinfo->type=FIELD_SKIP_ENDSPACE;
  3941. else
  3942. recinfo->type=FIELD_NORMAL;
  3943. field->set_table_name(&table->alias);
  3944. }
  3945. if (thd->variables.tmp_memory_table_size == ~ (ulonglong) 0) // No limit
  3946. share->max_rows= ~(ha_rows) 0;
  3947. else
  3948. share->max_rows= (ha_rows) (((share->db_type() == heap_hton) ?
  3949. MY_MIN(thd->variables.tmp_memory_table_size,
  3950. thd->variables.max_heap_table_size) :
  3951. thd->variables.tmp_memory_table_size) /
  3952. share->reclength);
  3953. set_if_bigger(share->max_rows,1); // For dummy start options
  3954. //// keyinfo= param->keyinfo;
  3955. if (TRUE)
  3956. {
  3957. DBUG_PRINT("info",("Creating group key in temporary table"));
  3958. share->keys=1;
  3959. share->uniques= MY_TEST(using_unique_constraint);
  3960. table->key_info=keyinfo;
  3961. keyinfo->key_part=key_part_info;
  3962. keyinfo->flags=HA_NOSAME;
  3963. keyinfo->usable_key_parts= keyinfo->user_defined_key_parts= 1;
  3964. keyinfo->key_length=0;
  3965. keyinfo->rec_per_key=0;
  3966. keyinfo->algorithm= HA_KEY_ALG_UNDEF;
  3967. keyinfo->name= weedout_key;
  3968. {
  3969. key_part_info->null_bit=0;
  3970. key_part_info->field= field;
  3971. key_part_info->offset= field->offset(table->record[0]);
  3972. key_part_info->length= (uint16) field->key_length();
  3973. key_part_info->type= (uint8) field->key_type();
  3974. key_part_info->key_type = FIELDFLAG_BINARY;
  3975. if (!using_unique_constraint)
  3976. {
  3977. if (!(key_field= field->new_key_field(thd->mem_root, table,
  3978. group_buff,
  3979. key_part_info->length,
  3980. field->null_ptr,
  3981. field->null_bit)))
  3982. goto err;
  3983. }
  3984. keyinfo->key_length+= key_part_info->length;
  3985. }
  3986. }
  3987. if (unlikely(thd->is_fatal_error)) // If end of memory
  3988. goto err;
  3989. share->db_record_offset= 1;
  3990. table->no_rows= 1; // We don't need the data
  3991. // recinfo must point after last field
  3992. recinfo++;
  3993. if (share->db_type() == TMP_ENGINE_HTON)
  3994. {
  3995. if (unlikely(create_internal_tmp_table(table, keyinfo, start_recinfo,
  3996. &recinfo, 0)))
  3997. goto err;
  3998. }
  3999. if (unlikely(open_tmp_table(table)))
  4000. goto err;
  4001. thd->mem_root= mem_root_save;
  4002. tmp_table= table;
  4003. DBUG_RETURN(FALSE);
  4004. err:
  4005. thd->mem_root= mem_root_save;
  4006. free_tmp_table(thd,table); /* purecov: inspected */
  4007. if (temp_pool_slot != MY_BIT_NONE)
  4008. bitmap_lock_clear_bit(&temp_pool, temp_pool_slot);
  4009. DBUG_RETURN(TRUE); /* purecov: inspected */
  4010. }
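/*
  Illustration only, not part of the server code. The table created above
  has one column holding the concatenated rowids and a unique key (or
  unique constraint) on it, so a failed insert means "this combination was
  seen before". The same idea can be sketched with an in-memory set.
  WeedoutSketch is hypothetical; it assumes every table contributes a
  fixed-length rowid, so plain concatenation is unambiguous, and it ignores
  the NULL-complemented row handling of the real code.
*/

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

class WeedoutSketch
{
  std::set<std::string> seen_;  // plays the role of the unique key
public:
  // Returns 1 if the rowid combination is a duplicate, 0 if it is new.
  int check_row(const std::vector<std::string> &rowids)
  {
    std::string key;
    for (const std::string &r : rowids)
      key+= r;  // one column: concatenation of all rowids
    return seen_.insert(key).second ? 0 : 1;
  }

  // Forget everything seen so far.
  void delete_rows() { seen_.clear(); }
};
```

/*
  The return values of check_row follow the convention used by
  sj_weedout_check_row(): 1 means discard the row combination, 0 means
  continue.
*/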
  4011. /*
  4012. SemiJoinDuplicateElimination: Reset the temporary table
  4013. */
  4014. int SJ_TMP_TABLE::sj_weedout_delete_rows()
  4015. {
  4016. DBUG_ENTER("SJ_TMP_TABLE::sj_weedout_delete_rows");
  4017. if (tmp_table)
  4018. {
  4019. int rc= tmp_table->file->ha_delete_all_rows();
  4020. DBUG_RETURN(rc);
  4021. }
  4022. have_degenerate_row= FALSE;
  4023. DBUG_RETURN(0);
  4024. }
  4025. /*
  4026. SemiJoinDuplicateElimination: Weed out duplicate row combinations
  4027. SYNOPSIS
  4028. sj_weedout_check_row()
  4029. thd Thread handle
  4030. DESCRIPTION
  4031. Try storing current record combination of outer tables (i.e. their
  4032. rowids) in the temporary table. This records the fact that we've seen
  4033. this record combination and also tells us if we've seen it before.
  4034. RETURN
  4035. -1 Error
  4036. 1 The row combination is a duplicate (discard it)
  4037. 0 The row combination is not a duplicate (continue)
  4038. */
  4039. int SJ_TMP_TABLE::sj_weedout_check_row(THD *thd)
  4040. {
  4041. int error;
  4042. SJ_TMP_TABLE::TAB *tab= tabs;
  4043. SJ_TMP_TABLE::TAB *tab_end= tabs_end;
  4044. uchar *ptr;
  4045. uchar *nulls_ptr;
  4046. DBUG_ENTER("SJ_TMP_TABLE::sj_weedout_check_row");
  4047. if (is_degenerate)
  4048. {
  4049. if (have_degenerate_row)
  4050. DBUG_RETURN(1);
  4051. have_degenerate_row= TRUE;
  4052. DBUG_RETURN(0);
  4053. }
  4054. ptr= tmp_table->record[0] + 1;
  4055. /* Put the rowids tuple into table->record[0]: */
  4056. // 1. Store the length
  4057. if (((Field_varstring*)(tmp_table->field[0]))->length_bytes == 1)
  4058. {
  4059. *ptr= (uchar)(rowid_len + null_bytes);
  4060. ptr++;
  4061. }
  4062. else
  4063. {
  4064. int2store(ptr, rowid_len + null_bytes);
  4065. ptr += 2;
  4066. }
  4067. nulls_ptr= ptr;
  4068. // 2. Zero the null bytes
  4069. if (null_bytes)
  4070. {
  4071. bzero(ptr, null_bytes);
  4072. ptr += null_bytes;
  4073. }
  4074. // 3. Put the rowids
  4075. for (uint i=0; tab != tab_end; tab++, i++)
  4076. {
  4077. handler *h= tab->join_tab->table->file;
  4078. if (tab->join_tab->table->maybe_null && tab->join_tab->table->null_row)
  4079. {
  4080. /* It's a NULL-complemented row */
  4081. *(nulls_ptr + tab->null_byte) |= tab->null_bit;
  4082. bzero(ptr + tab->rowid_offset, h->ref_length);
  4083. }
  4084. else
  4085. {
  4086. /* Copy the rowid value */
  4087. memcpy(ptr + tab->rowid_offset, h->ref, h->ref_length);
  4088. }
  4089. }
  4090. error= tmp_table->file->ha_write_tmp_row(tmp_table->record[0]);
  4091. if (unlikely(error))
  4092. {
  4093. /* create_internal_tmp_table_from_heap will generate error if needed */
  4094. if (!tmp_table->file->is_fatal_error(error, HA_CHECK_DUP))
  4095. DBUG_RETURN(1); /* Duplicate */
  4096. bool is_duplicate;
  4097. if (create_internal_tmp_table_from_heap(thd, tmp_table, start_recinfo,
  4098. &recinfo, error, 1, &is_duplicate))
  4099. DBUG_RETURN(-1);
  4100. if (is_duplicate)
  4101. DBUG_RETURN(1);
  4102. }
  4103. DBUG_RETURN(0);
  4104. }
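/*
  Illustration only, not part of the server code: a minimal sketch of the
  DuplicateWeedout check described above, assuming an in-memory hash set in
  place of the temporary table. WeedoutSet and RowId are hypothetical names.
  The key layout (NULL flags first, then fixed-width rowids, zero-filled for
  NULL-complemented rows) mirrors the comment's steps 2 and 3, except that
  one byte per table is used for the NULL flag instead of packed bits.
*/

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one outer table's current row: a rowid, or
// std::nullopt when the row is NULL-complemented.
using RowId = std::optional<std::uint64_t>;

class WeedoutSet {
public:
  // Returns 1 if this rowid combination was seen before (discard it),
  // 0 if it is new (continue).
  int check_row(const std::vector<RowId> &rowids) {
    std::string key;
    // NULL flags: one byte per table in this sketch.
    for (const RowId &r : rowids)
      key.push_back(r ? '\0' : '\1');
    // Rowids: NULL-complemented rows contribute zero bytes.
    for (const RowId &r : rowids) {
      std::uint64_t v = r.value_or(0);
      key.append(reinterpret_cast<const char *>(&v), sizeof(v));
    }
    return seen_.insert(key).second ? 0 : 1;
  }

  // Counterpart of resetting the table before a join re-execution.
  void reset() { seen_.clear(); }

private:
  std::unordered_set<std::string> seen_;
};
```

/*
  Because the NULL flags are part of the key, a NULL-complemented row and a
  row whose rowid happens to be zero produce different keys.
*/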
  4105. int init_dups_weedout(JOIN *join, uint first_table, int first_fanout_table, uint n_tables)
  4106. {
  4107. THD *thd= join->thd;
  4108. DBUG_ENTER("init_dups_weedout");
  4109. SJ_TMP_TABLE::TAB sjtabs[MAX_TABLES];
  4110. SJ_TMP_TABLE::TAB *last_tab= sjtabs;
  4111. uint jt_rowid_offset= 0; // # tuple bytes are already occupied (w/o NULL bytes)
  4112. uint jt_null_bits= 0; // # null bits in tuple bytes
  4113. /*
  4114. Walk through the range and remember
  4115. - tables that need their rowids to be put into temptable
  4116. - the last outer table
  4117. */
  4118. for (JOIN_TAB *j=join->join_tab + first_table;
  4119. j < join->join_tab + first_table + n_tables; j++)
  4120. {
  4121. if (sj_table_is_included(join, j))
  4122. {
  4123. last_tab->join_tab= j;
  4124. last_tab->rowid_offset= jt_rowid_offset;
  4125. jt_rowid_offset += j->table->file->ref_length;
  4126. if (j->table->maybe_null)
  4127. {
  4128. last_tab->null_byte= jt_null_bits / 8;
  4129. last_tab->null_bit= jt_null_bits++;
  4130. }
  4131. last_tab++;
  4132. j->table->prepare_for_position();
  4133. j->keep_current_rowid= TRUE;
  4134. }
  4135. }
  4136. SJ_TMP_TABLE *sjtbl;
  4137. if (jt_rowid_offset) /* Temptable has at least one rowid */
  4138. {
  4139. size_t tabs_size= (last_tab - sjtabs) * sizeof(SJ_TMP_TABLE::TAB);
  4140. if (!(sjtbl= (SJ_TMP_TABLE*)thd->alloc(sizeof(SJ_TMP_TABLE))) ||
  4141. !(sjtbl->tabs= (SJ_TMP_TABLE::TAB*) thd->alloc(tabs_size)))
  4142. DBUG_RETURN(TRUE); /* purecov: inspected */
  4143. memcpy(sjtbl->tabs, sjtabs, tabs_size);
  4144. sjtbl->is_degenerate= FALSE;
  4145. sjtbl->tabs_end= sjtbl->tabs + (last_tab - sjtabs);
  4146. sjtbl->rowid_len= jt_rowid_offset;
  4147. sjtbl->null_bits= jt_null_bits;
  4148. sjtbl->null_bytes= (jt_null_bits + 7)/8;
  4149. if (sjtbl->create_sj_weedout_tmp_table(thd))
  4150. DBUG_RETURN(TRUE);
  4151. join->sj_tmp_tables.push_back(sjtbl->tmp_table, thd->mem_root);
  4152. }
  4153. else
  4154. {
  4155. /*
  4156. This is a special case where the entire subquery predicate does
  4157. not depend on anything at all, ie this is
  4158. WHERE const IN (uncorrelated select)
  4159. */
  4160. if (!(sjtbl= (SJ_TMP_TABLE*)thd->alloc(sizeof(SJ_TMP_TABLE))))
  4161. DBUG_RETURN(TRUE); /* purecov: inspected */
  4162. sjtbl->tmp_table= NULL;
  4163. sjtbl->is_degenerate= TRUE;
  4164. sjtbl->have_degenerate_row= FALSE;
  4165. }
  4166. sjtbl->next_flush_table= join->join_tab[first_table].flush_weedout_table;
  4167. join->join_tab[first_table].flush_weedout_table= sjtbl;
  4168. join->join_tab[first_fanout_table].first_weedout_table= sjtbl;
  4169. join->join_tab[first_table + n_tables - 1].check_weed_out_table= sjtbl;
  4170. DBUG_RETURN(0);
  4171. }
  4172. /*
  4173. @brief
  4174. Set up semi-join Loose Scan strategy for execution
  4175. @detail
  4176. Other strategies are done in setup_semijoin_dups_elimination(),
  4177. however, we need to set up Loose Scan earlier, before make_join_select is
  4178. called. This is to prevent make_join_select() from switching full index
  4179. scans into quick selects (which will break Loose Scan access).
  4180. @return
  4181. 0 OK
  4182. 1 Error
  4183. */
  4184. int setup_semijoin_loosescan(JOIN *join)
  4185. {
  4186. uint i;
  4187. DBUG_ENTER("setup_semijoin_loosescan");
  4188. POSITION *pos= join->best_positions + join->const_tables;
  4189. for (i= join->const_tables ; i < join->top_join_tab_count; )
  4190. {
  4191. JOIN_TAB *tab=join->join_tab + i;
  4192. switch (pos->sj_strategy) {
  4193. case SJ_OPT_MATERIALIZE:
  4194. case SJ_OPT_MATERIALIZE_SCAN:
  4195. i+= 1; /* join tabs are embedded in the nest */
  4196. pos += pos->n_sj_tables;
  4197. break;
  4198. case SJ_OPT_LOOSE_SCAN:
  4199. {
  4200. /* We jump from the last table to the first one */
  4201. tab->loosescan_match_tab= tab + pos->n_sj_tables - 1;
  4202. /* LooseScan requires records to be produced in order */
  4203. if (tab->select && tab->select->quick)
  4204. tab->select->quick->need_sorted_output();
  4205. for (uint j= i; j < i + pos->n_sj_tables; j++)
  4206. join->join_tab[j].inside_loosescan_range= TRUE;
  4207. /* Calculate key length */
  4208. uint keylen= 0;
  4209. uint keyno= pos->loosescan_picker.loosescan_key;
  4210. for (uint kp=0; kp < pos->loosescan_picker.loosescan_parts; kp++)
  4211. keylen += tab->table->key_info[keyno].key_part[kp].store_length;
  4212. tab->loosescan_key= keyno;
  4213. tab->loosescan_key_len= keylen;
  4214. if (pos->n_sj_tables > 1)
  4215. tab[pos->n_sj_tables - 1].do_firstmatch= tab;
  4216. i+= pos->n_sj_tables;
  4217. pos+= pos->n_sj_tables;
  4218. break;
  4219. }
  4220. default:
  4221. {
  4222. i++;
  4223. pos++;
  4224. break;
  4225. }
  4226. }
  4227. }
  4228. DBUG_RETURN(FALSE);
  4229. }
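/*
  Illustration only, not part of the server code: a minimal sketch of why
  LooseScan needs records in key order. It assumes the sj-inner table is
  modelled as a vector of key values that is already sorted;
  loose_scan_distinct is a hypothetical name. With sorted input, duplicates
  are adjacent, so keeping the first row of each group is enough.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emit the first row of every group of equal keys. Correct only if
// sorted_keys is ordered so that equal keys are adjacent.
std::vector<int> loose_scan_distinct(const std::vector<int> &sorted_keys) {
  std::vector<int> out;
  for (std::size_t i = 0; i < sorted_keys.size(); ++i) {
    if (i == 0 || sorted_keys[i] != sorted_keys[i - 1])
      out.push_back(sorted_keys[i]);
  }
  return out;
}
```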
  4230. /*
  4231. Setup the strategies to eliminate semi-join duplicates.
  4232. SYNOPSIS
  4233. setup_semijoin_dups_elimination()
  4234. join Join to process
  4235. options Join options (needed to see if join buffering will be
  4236. used or not)
  4237. no_jbuf_after Another bit of information re where join buffering will
  4238. be used.
  4239. DESCRIPTION
  4240. Setup the strategies to eliminate semi-join duplicates. ATM there are 4
  4241. strategies:
  4242. 1. DuplicateWeedout (use of temptable to remove duplicates based on rowids
  4243. of row combinations)
  4244. 2. FirstMatch (pick only the 1st matching row combination of inner tables)
  4245. 3. LooseScan (scanning the sj-inner table in a way that groups duplicates
  4246. together and picking the 1st one)
  4247. 4. SJ-Materialization.
  4248. The join order has "duplicate-generating ranges", and every range is
  4249. served by one strategy or a combination of FirstMatch with some
  4250. other strategy.
  4251. "Duplicate-generating range" is defined as a range within the join order
  4252. that contains all of the inner tables of a semi-join. All ranges must be
  4253. disjoint; if tables of several semi-joins are interleaved, then the ranges
  4254. are joined together, which is equivalent to converting
  4255. SELECT ... WHERE oe1 IN (SELECT ie1 ...) AND oe2 IN (SELECT ie2 )
  4256. to
  4257. SELECT ... WHERE (oe1, oe2) IN (SELECT ie1, ie2 ... ...)
  4258. .
  4259. Applicability conditions are as follows:
  4260. DuplicateWeedout strategy
  4261. ~~~~~~~~~~~~~~~~~~~~~~~~~
  4262. (ot|nt)* [ it ((it|ot|nt)* (it|ot))] (nt)*
  4263. +------+ +=========================+ +---+
  4264. (1) (2) (3)
  4265. (1) - Prefix of OuterTables (those that participate in
  4266. IN-equality and/or are correlated with subquery) and outer
  4267. Non-correlated tables.
  4268. (2) - The handled range. The range starts with the first sj-inner
  4269. table, and covers all sj-inner and outer tables.
  4270. Within the range, Inner, Outer, outer non-correlated tables
  4271. may follow in any order.
  4272. (3) - The suffix of outer non-correlated tables.
  4273. FirstMatch strategy
  4274. ~~~~~~~~~~~~~~~~~~~
  4275. (ot|nt)* [ it ((it|nt)* it) ] (nt)*
  4276. +------+ +==================+ +---+
  4277. (1) (2) (3)
  4278. (1) - Prefix of outer and non-correlated tables
  4279. (2) - The handled range, which may contain only inner and
  4280. non-correlated tables.
  4281. (3) - The suffix of outer non-correlated tables.
  4282. LooseScan strategy
  4283. ~~~~~~~~~~~~~~~~~~
  4284. (ot|ct|nt) [ loosescan_tbl (ot|nt|it)* it ] (ot|nt)*
  4285. +--------+ +===========+ +=============+ +------+
  4286. (1) (2) (3) (4)
  4287. (1) - Prefix that may contain any outer tables. The prefix must contain
  4288. all the non-trivially correlated outer tables. (non-trivially means
  4289. that the correlation is not just through the IN-equality).
  4290. (2) - Inner table for which the LooseScan scan is performed.
  4291. (3) - The remainder of the duplicate-generating range. It is served by
  4292. application of FirstMatch strategy, with the exception that
  4293. outer IN-correlated tables are considered to be non-correlated.
  4294. (4) - The suffix of outer and outer non-correlated tables.
  4295. The choice between the strategies is made by the join optimizer (see
  4296. advance_sj_state() and fix_semijoin_strategies_for_picked_join_order()).
  4297. This function sets up all fields/structures/etc needed for execution except
  4298. for setup/initialization of semi-join materialization which is done in
  4299. setup_sj_materialization() (todo: can't we move that to here also?)
  4300. RETURN
  4301. FALSE OK
  4302. TRUE Out of memory error
  4303. */
  4304. int setup_semijoin_dups_elimination(JOIN *join, ulonglong options,
  4305. uint no_jbuf_after)
  4306. {
  4307. uint i;
  4308. DBUG_ENTER("setup_semijoin_dups_elimination");
  4309. join->complex_firstmatch_tables= table_map(0);
  4310. POSITION *pos= join->best_positions + join->const_tables;
  4311. for (i= join->const_tables ; i < join->top_join_tab_count; )
  4312. {
  4313. JOIN_TAB *tab=join->join_tab + i;
  4314. switch (pos->sj_strategy) {
  4315. case SJ_OPT_MATERIALIZE:
  4316. case SJ_OPT_MATERIALIZE_SCAN:
  4317. /* Do nothing */
  4318. i+= 1;// It used to be pos->n_sj_tables, but now they are embedded in a nest
  4319. pos += pos->n_sj_tables;
  4320. break;
  4321. case SJ_OPT_LOOSE_SCAN:
  4322. {
  4323. /* Setup already handled by setup_semijoin_loosescan */
  4324. i+= pos->n_sj_tables;
  4325. pos+= pos->n_sj_tables;
  4326. break;
  4327. }
  4328. case SJ_OPT_DUPS_WEEDOUT:
  4329. {
  4330. /*
  4331. Check for join buffering. If there is one, move the first table
  4332. forwards, but do not destroy other duplicate elimination methods.
  4333. */
  4334. uint first_table= i;
  4335. uint join_cache_level= join->thd->variables.join_cache_level;
  4336. for (uint j= i; j < i + pos->n_sj_tables; j++)
  4337. {
  4338. /*
  4339. When we'll properly take join buffering into account during
  4340. join optimization, the below check should be changed to
  4341. "if (join->best_positions[j].use_join_buffer &&
  4342. j <= no_jbuf_after)".
  4343. For now, use a rough criterion:
  4344. */
  4345. JOIN_TAB *js_tab=join->join_tab + j;
  4346. if (j != join->const_tables && js_tab->use_quick != 2 &&
  4347. j <= no_jbuf_after &&
  4348. ((js_tab->type == JT_ALL && join_cache_level != 0) ||
  4349. (join_cache_level > 2 && (js_tab->type == JT_REF ||
  4350. js_tab->type == JT_EQ_REF))))
  4351. {
  4352. /* Looks like we'll be using join buffer */
  4353. first_table= join->const_tables;
  4354. /*
  4355. Make sure that possible sorting of rows from the head table
  4356. is not to be employed.
  4357. */
  4358. if (join->get_sort_by_join_tab())
  4359. {
  4360. join->simple_order= 0;
  4361. join->simple_group= 0;
  4362. join->need_tmp= join->test_if_need_tmp_table();
  4363. }
  4364. break;
  4365. }
  4366. }
  4367. init_dups_weedout(join, first_table, i, i + pos->n_sj_tables - first_table);
  4368. i+= pos->n_sj_tables;
  4369. pos+= pos->n_sj_tables;
  4370. break;
  4371. }
  4372. case SJ_OPT_FIRST_MATCH:
  4373. {
  4374. JOIN_TAB *j;
  4375. JOIN_TAB *jump_to= tab-1;
  4376. bool complex_range= FALSE;
  4377. table_map tables_in_range= table_map(0);
  4378. for (j= tab; j != tab + pos->n_sj_tables; j++)
  4379. {
  4380. tables_in_range |= j->table->map;
  4381. if (!j->emb_sj_nest)
  4382. {
  4383. /*
  4384. Got a table that's not within any semi-join nest. This is a case
  4385. like this:
  4386. SELECT * FROM ot1, nt1 WHERE ot1.col IN (SELECT expr FROM it1, it2)
  4387. with a join order of
  4388. +----- FirstMatch range ----+
  4389. | |
  4390. ot1 it1 nt1 nt2 it2 it3 ...
  4391. | ^
  4392. | +-------- 'j' points here
  4393. +------------- SJ_OPT_FIRST_MATCH was set for this table as
  4394. it's the first one that produces duplicates
  4395. */
  4396. DBUG_ASSERT(j != tab); /* table ntX must have an itX before it */
  4397. /*
  4398. If the table right before us is an inner table (like it1 in the
  4399. picture), it should be set to jump back to previous outer-table
  4400. */
  4401. if (j[-1].emb_sj_nest)
  4402. j[-1].do_firstmatch= jump_to;
  4403. jump_to= j; /* Jump back to us */
  4404. complex_range= TRUE;
  4405. }
  4406. else
  4407. {
  4408. j->first_sj_inner_tab= tab;
  4409. j->last_sj_inner_tab= tab + pos->n_sj_tables - 1;
  4410. }
  4411. }
  4412. j[-1].do_firstmatch= jump_to;
  4413. i+= pos->n_sj_tables;
  4414. pos+= pos->n_sj_tables;
  4415. if (complex_range)
  4416. join->complex_firstmatch_tables|= tables_in_range;
  4417. break;
  4418. }
  4419. case SJ_OPT_NONE:
  4420. i++;
  4421. pos++;
  4422. break;
  4423. }
  4424. }
  4425. DBUG_RETURN(FALSE);
  4426. }
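/*
  Illustration only, not part of the server code: a minimal sketch of the
  FirstMatch idea from the comment above, assuming both tables are plain
  vectors of ints joined on equality. semijoin_first_match is a hypothetical
  name. As soon as one inner match is found, the inner loop stops and
  execution "jumps back" to the outer table, so an outer row is never
  emitted more than once.
*/

```cpp
#include <cassert>
#include <vector>

// Return the outer rows that have at least one equal row in inner.
std::vector<int> semijoin_first_match(const std::vector<int> &outer,
                                      const std::vector<int> &inner) {
  std::vector<int> result;
  for (int o : outer) {
    for (int i : inner) {
      if (o == i) {
        result.push_back(o);
        break;  // first match found: go back to the outer table
      }
    }
  }
  return result;
}
```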
  4427. /*
  4428. Destroy all temporary tables created by NL-semijoin runtime
  4429. */
  4430. void destroy_sj_tmp_tables(JOIN *join)
  4431. {
  4432. List_iterator<TABLE> it(join->sj_tmp_tables);
  4433. TABLE *table;
  4434. while ((table= it++))
  4435. {
  4436. /*
  4437. SJ-Materialization tables are initialized for either sequential reading
  4438. or index lookup, DuplicateWeedout tables are not initialized for read
  4439. (we only write to them), so need to call ha_index_or_rnd_end.
  4440. */
  4441. table->file->ha_index_or_rnd_end();
  4442. free_tmp_table(join->thd, table);
  4443. }
  4444. join->sj_tmp_tables.empty();
  4445. join->sjm_info_list.empty();
  4446. }
  4447. /*
  4448. Remove all records from all temp tables used by NL-semijoin runtime
  4449. SYNOPSIS
  4450. clear_sj_tmp_tables()
  4451. join The join to remove tables for
  4452. DESCRIPTION
  4453. Remove all records from all temp tables used by NL-semijoin runtime. This
  4454. must be done before every join re-execution.
  4455. */
  4456. int clear_sj_tmp_tables(JOIN *join)
  4457. {
  4458. int res;
  4459. List_iterator<TABLE> it(join->sj_tmp_tables);
  4460. TABLE *table;
  4461. while ((table= it++))
  4462. {
  4463. if ((res= table->file->ha_delete_all_rows()))
  4464. return res; /* purecov: inspected */
  4465. }
  4466. SJ_MATERIALIZATION_INFO *sjm;
  4467. List_iterator<SJ_MATERIALIZATION_INFO> it2(join->sjm_info_list);
  4468. while ((sjm= it2++))
  4469. {
  4470. sjm->materialized= FALSE;
  4471. }
  4472. return 0;
  4473. }
  4474. /*
  4475. Check if the table's rowid is included in the temptable
  4476. SYNOPSIS
  4477. sj_table_is_included()
  4478. join The join
  4479. join_tab The table to be checked
  4480. DESCRIPTION
  4481. SemiJoinDuplicateElimination: check whether the table's rowid should be included
  4482. in the temptable. This is so if
  4483. 1. The table is not embedded within some semi-join nest
  4484. 2. The table has been pulled out of a semi-join nest, or
  4485. 3. The table is functionally dependent on some previous table
  4486. [4. This is also true for constant tables that can't be
  4487. NULL-complemented but this function is not called for such tables]
  4488. RETURN
  4489. TRUE - Include table's rowid
  4490. FALSE - Don't
  4491. */
  4492. static bool sj_table_is_included(JOIN *join, JOIN_TAB *join_tab)
  4493. {
  4494. if (join_tab->emb_sj_nest)
  4495. return FALSE;
  4496. /* Check if this table is functionally dependent on the tables that
  4497. are within the same outer join nest
  4498. */
  4499. TABLE_LIST *embedding= join_tab->table->pos_in_table_list->embedding;
  4500. if (join_tab->type == JT_EQ_REF)
  4501. {
  4502. table_map depends_on= 0;
  4503. uint idx;
  4504. for (uint kp= 0; kp < join_tab->ref.key_parts; kp++)
  4505. depends_on |= join_tab->ref.items[kp]->used_tables();
  4506. Table_map_iterator it(depends_on & ~PSEUDO_TABLE_BITS);
  4507. while ((idx= it.next_bit())!=Table_map_iterator::BITMAP_END)
  4508. {
  4509. JOIN_TAB *ref_tab= join->map2table[idx];
  4510. if (embedding != ref_tab->table->pos_in_table_list->embedding)
  4511. return TRUE;
  4512. }
  4513. /* Ok, functionally dependent */
  4514. return FALSE;
  4515. }
  4516. /* Not functionally dependent => need to include*/
  4517. return TRUE;
  4518. }
  4519. /*
  4520. Index lookup-based subquery: save some flags for EXPLAIN output
  4521. SYNOPSIS
  4522. save_index_subquery_explain_info()
  4523. join_tab Subquery's join tab (there is only one as index lookup is
  4524. only used for subqueries that are single-table SELECTs)
  4525. where Subquery's WHERE clause
  4526. DESCRIPTION
  4527. For index lookup-based subquery (i.e. one executed with
  4528. subselect_uniquesubquery_engine or subselect_indexsubquery_engine),
  4529. check whether its EXPLAIN output row should contain
  4530. "Using index" (TAB_INFO_USING_INDEX)
  4531. "Using Where" (TAB_INFO_USING_WHERE)
  4532. "Full scan on NULL key" (TAB_INFO_FULL_SCAN_ON_NULL)
  4533. and set appropriate flags in join_tab->packed_info.
  4534. */
  4535. static void save_index_subquery_explain_info(JOIN_TAB *join_tab, Item* where)
  4536. {
  4537. join_tab->packed_info= TAB_INFO_HAVE_VALUE;
  4538. if (join_tab->table->covering_keys.is_set(join_tab->ref.key))
  4539. join_tab->packed_info |= TAB_INFO_USING_INDEX;
  4540. if (where)
  4541. join_tab->packed_info |= TAB_INFO_USING_WHERE;
  4542. for (uint i = 0; i < join_tab->ref.key_parts; i++)
  4543. {
  4544. if (join_tab->ref.cond_guards[i])
  4545. {
  4546. join_tab->packed_info |= TAB_INFO_FULL_SCAN_ON_NULL;
  4547. break;
  4548. }
  4549. }
  4550. }
  4551. /*
  4552. Check if the join can be rewritten to [unique_]indexsubquery_engine
  4553. DESCRIPTION
  4554. Check if the join can be changed into [unique_]indexsubquery_engine.
  4555. The check is done after join optimization, the idea is that if the join
  4556. has only one table and uses a [eq_]ref access generated from subselect's
  4557. IN-equality then we replace it with a subselect_indexsubquery_engine or a
  4558. subselect_uniquesubquery_engine.
  4559. RETURN
  4560. 0 - Ok, rewrite done (stop join optimization and return)
  4561. 1 - Fatal error (stop join optimization and return)
  4562. -1 - No rewrite performed, continue with join optimization
  4563. */
  4564. int rewrite_to_index_subquery_engine(JOIN *join)
  4565. {
  4566. THD *thd= join->thd;
  4567. JOIN_TAB* join_tab=join->join_tab;
  4568. SELECT_LEX_UNIT *unit= join->unit;
  4569. DBUG_ENTER("rewrite_to_index_subquery_engine");
  4570. /*
  4571. Is this a simple IN subquery?
  4572. */
  4573. /* TODO: In order to use these more efficient subquery engines in more cases,
  4574. the following problems need to be solved:
  4575. - the code that removes GROUP BY (group_list), also adds an ORDER BY
  4576. (order), thus GROUP BY queries (almost?) never pass through this branch.
  4577. Solution: remove the test below '!join->order', because we remove the
  4578. ORDER clause for subqueries anyway.
  4579. - in order to set a more efficient engine, the optimizer needs to both
  4580. decide to remove GROUP BY, *and* select one of the JT_[EQ_]REF[_OR_NULL]
  4581. access methods, *and* loose scan should be more expensive or
  4582. inapplicable. When is that possible?
  4583. - Consider expanding the applicability of this rewrite for loose scan
  4584. for group by queries.
  4585. */
  4586. if (!join->group_list && !join->order &&
  4587. join->unit->item &&
  4588. join->unit->item->substype() == Item_subselect::IN_SUBS &&
  4589. join->table_count == 1 && join->conds &&
  4590. !join->unit->is_unit_op())
  4591. {
  4592. if (!join->having)
  4593. {
  4594. Item *where= join->conds;
  4595. if (join_tab[0].type == JT_EQ_REF &&
  4596. join_tab[0].ref.items[0]->name.str == in_left_expr_name.str)
  4597. {
  4598. remove_subq_pushed_predicates(join, &where);
  4599. save_index_subquery_explain_info(join_tab, where);
  4600. join_tab[0].type= JT_UNIQUE_SUBQUERY;
  4601. join->error= 0;
  4602. DBUG_RETURN(unit->item->
  4603. change_engine(new
  4604. subselect_uniquesubquery_engine(thd,
  4605. join_tab,
  4606. unit->item,
  4607. where)));
  4608. }
  4609. else if (join_tab[0].type == JT_REF &&
  4610. join_tab[0].ref.items[0]->name.str == in_left_expr_name.str)
  4611. {
  4612. remove_subq_pushed_predicates(join, &where);
  4613. save_index_subquery_explain_info(join_tab, where);
  4614. join_tab[0].type= JT_INDEX_SUBQUERY;
  4615. join->error= 0;
  4616. DBUG_RETURN(unit->item->
  4617. change_engine(new
  4618. subselect_indexsubquery_engine(thd,
  4619. join_tab,
  4620. unit->item,
  4621. where,
  4622. NULL,
  4623. 0)));
  4624. }
  4625. } else if (join_tab[0].type == JT_REF_OR_NULL &&
  4626. join_tab[0].ref.items[0]->name.str == in_left_expr_name.str &&
  4627. join->having->name.str == in_having_cond.str)
  4628. {
  4629. join_tab[0].type= JT_INDEX_SUBQUERY;
  4630. join->error= 0;
  4631. join->conds= remove_additional_cond(join->conds);
  4632. save_index_subquery_explain_info(join_tab, join->conds);
  4633. DBUG_RETURN(unit->item->
  4634. change_engine(new subselect_indexsubquery_engine(thd,
  4635. join_tab,
  4636. unit->item,
  4637. join->conds,
  4638. join->having,
  4639. 1)));
  4640. }
  4641. }
  4642. DBUG_RETURN(-1); /* Haven't done the rewrite */
  4643. }
  4644. /**
  4645. Remove additional condition inserted by IN/ALL/ANY transformation.
  4646. @param conds condition for processing
  4647. @return
  4648. new conditions
  4649. */
  4650. static Item *remove_additional_cond(Item* conds)
  4651. {
  4652. if (conds->name.str == in_additional_cond.str)
  4653. return 0;
  4654. if (conds->type() == Item::COND_ITEM)
  4655. {
  4656. Item_cond *cnd= (Item_cond*) conds;
  4657. List_iterator<Item> li(*(cnd->argument_list()));
  4658. Item *item;
  4659. while ((item= li++))
  4660. {
  4661. if (item->name.str == in_additional_cond.str)
  4662. {
  4663. li.remove();
  4664. if (cnd->argument_list()->elements == 1)
  4665. return cnd->argument_list()->head();
  4666. return conds;
  4667. }
  4668. }
  4669. }
  4670. return conds;
  4671. }
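/*
  Illustration only, not part of the server code: a minimal sketch of the
  removal logic above, assuming a condition is a small tree node with a name
  and a list of AND arguments. Cond and remove_tagged are hypothetical
  names, and names are compared by value here, whereas the function above
  compares name.str pointers. The three outcomes match the original: the
  whole condition is the tagged one (nothing remains), one argument remains
  after removal (the AND collapses to it), or the AND is kept.
*/

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

struct Cond {
  std::string name;        // tag identifying the item
  std::vector<Cond> args;  // non-empty only for an AND node
};

// Remove the first argument named `tag`. Returns std::nullopt when the
// condition itself is the tagged one.
std::optional<Cond> remove_tagged(Cond cond, const std::string &tag) {
  if (cond.name == tag)
    return std::nullopt;
  for (auto it = cond.args.begin(); it != cond.args.end(); ++it) {
    if (it->name == tag) {
      cond.args.erase(it);
      if (cond.args.size() == 1)
        return cond.args.front();  // AND with one argument: return the argument
      return cond;
    }
  }
  return cond;
}
```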
  4672. /*
  4673. Remove the predicates pushed down into the subquery
  4674. SYNOPSIS
  4675. remove_subq_pushed_predicates()
  4676. where IN Must be NULL
  4677. OUT The remaining WHERE condition, or NULL
  4678. DESCRIPTION
  4679. Given that this join will be executed using (unique|index)_subquery,
  4680. without "checking NULL", remove the predicates that were pushed down
  4681. into the subquery.
  4682. If the subquery compares scalar values, we can remove the condition that
  4683. was wrapped into trig_cond (it will be checked when needed by the subquery
  4684. engine)
  4685. If the subquery compares row values, we need to keep the wrapped
  4686. equalities in the WHERE clause: when the left (outer) tuple has both NULL
  4687. and non-NULL values, we'll do a full table scan and will rely on the
  4688. equalities corresponding to non-NULL parts of left tuple to filter out
  4689. non-matching records.
  4690. TODO: We can remove the equalities that will be guaranteed to be true by the
  4691. fact that subquery engine will be using index lookup. This must be done only
  4692. for cases where there are no conversion errors of significance, e.g. 257
  4693. that is searched in a byte. But this requires homogenization of the return
  4694. codes of all Field*::store() methods.
  4695. */
  4696. static void remove_subq_pushed_predicates(JOIN *join, Item **where)
  4697. {
  4698. if (join->conds->type() == Item::FUNC_ITEM &&
  4699. ((Item_func *)join->conds)->functype() == Item_func::EQ_FUNC &&
  4700. ((Item_func *)join->conds)->arguments()[0]->type() == Item::REF_ITEM &&
  4701. ((Item_func *)join->conds)->arguments()[1]->type() == Item::FIELD_ITEM &&
  4702. test_if_ref (join->conds,
  4703. (Item_field *)((Item_func *)join->conds)->arguments()[1],
  4704. ((Item_func *)join->conds)->arguments()[0]))
  4705. {
  4706. *where= 0;
  4707. return;
  4708. }
  4709. }
  4710. /**
  4711. Optimize all subqueries of a query that were not flattened into a semijoin.
  4712. @details
  4713. Optimize all immediate children subqueries of a query.
  4714. This phase must be called after substitute_for_best_equal_field() because
  4715. that function may replace items with other items from a multiple equality,
  4716. and we need to reference the correct items in the index access method of the
  4717. IN predicate.
  4718. @return Operation status
  4719. @retval FALSE success.
  4720. @retval TRUE error occurred.
  4721. */
  4722. bool JOIN::optimize_unflattened_subqueries()
  4723. {
  4724. return select_lex->optimize_unflattened_subqueries(false);
  4725. }
  4726. /**
  4727. Optimize all constant subqueries of a query that were not flattened into
  4728. a semijoin.
  4729. @details
  4730. Similar to other constant conditions, constant subqueries can be used in
  4731. various constant optimizations. Having optimized constant subqueries before
  4732. these constant optimizations, makes it possible to estimate if a subquery
  4733. is "cheap" enough to be executed during the optimization phase.
  4734. Constant subqueries can be optimized and evaluated independent of the outer
  4735. query, therefore if const_only = true, this method can be called early in
  4736. the optimization phase of the outer query.
  4737. @return Operation status
  4738. @retval FALSE success.
  4739. @retval TRUE error occurred.
  4740. */
  4741. bool JOIN::optimize_constant_subqueries()
  4742. {
  4743. ulonglong save_options= select_lex->options;
  4744. bool res;
  4745. /*
  4746. Constant subqueries may be executed during the optimization phase.
  4747. In EXPLAIN mode the optimizer doesn't initialize many of the data structures
  4748. needed for execution. In order to make it possible to execute subqueries
  4749. during optimization, constant subqueries must be optimized for execution,
  4750. not for EXPLAIN.
  4751. */
  4752. select_lex->options&= ~SELECT_DESCRIBE;
  4753. res= select_lex->optimize_unflattened_subqueries(true);
  4754. select_lex->options= save_options;
  4755. return res;
  4756. }
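/*
  Illustration only, not part of the server code: a minimal sketch of the
  save / clear / restore pattern used above, assuming the options are a
  plain 64-bit mask. SELECT_DESCRIBE_BIT and run_without_describe are
  hypothetical names, and the bit value is made up for the example.
*/

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t SELECT_DESCRIBE_BIT = 1ULL << 2;  // hypothetical value

// Run `body` with the describe bit cleared, then restore the original mask.
template <class F>
bool run_without_describe(std::uint64_t &options, F &&body) {
  const std::uint64_t saved = options;
  options &= ~SELECT_DESCRIBE_BIT;
  bool res = body();
  options = saved;
  return res;
}
```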
  4757. /*
  4758. Join tab execution startup function.
  4759. SYNOPSIS
  4760. join_tab_execution_startup()
  4761. tab Join tab to perform startup actions for
  4762. DESCRIPTION
  4763. Join tab execution startup function. This is different from
  4764. tab->read_first_record in the regard that this has actions that are to be
  4765. done once per join execution.
  4766. Currently there are only two possible startup functions, so we have them
  4767. both here inside if (...) branches. In future we could switch to function
  4768. pointers.
  4769. TODO: consider moving this together with JOIN_TAB::preread_init
  4770. RETURN
  4771. NESTED_LOOP_OK - OK
  4772. NESTED_LOOP_ERROR | NESTED_LOOP_KILLED - Error, abort the join execution
  4773. */
  4774. enum_nested_loop_state join_tab_execution_startup(JOIN_TAB *tab)
  4775. {
  4776. Item_in_subselect *in_subs;
  4777. DBUG_ENTER("join_tab_execution_startup");
  4778. if (tab->table->pos_in_table_list &&
  4779. (in_subs= tab->table->pos_in_table_list->jtbm_subselect))
  4780. {
  4781. /* It's a non-merged SJM nest */
  4782. DBUG_ASSERT(in_subs->engine->engine_type() ==
  4783. subselect_engine::HASH_SJ_ENGINE);
  4784. subselect_hash_sj_engine *hash_sj_engine=
  4785. ((subselect_hash_sj_engine*)in_subs->engine);
  4786. if (!hash_sj_engine->is_materialized)
  4787. {
  4788. hash_sj_engine->materialize_join->exec();
  4789. hash_sj_engine->is_materialized= TRUE;
  4790. if (unlikely(hash_sj_engine->materialize_join->error) ||
  4791. unlikely(tab->join->thd->is_fatal_error))
  4792. DBUG_RETURN(NESTED_LOOP_ERROR);
  4793. }
  4794. }
  4795. else if (tab->bush_children)
  4796. {
  4797. /* It's a merged SJM nest */
  4798. enum_nested_loop_state rc;
  4799. SJ_MATERIALIZATION_INFO *sjm= tab->bush_children->start->emb_sj_nest->sj_mat_info;
  4800. if (!sjm->materialized)
  4801. {
  4802. JOIN *join= tab->join;
  4803. JOIN_TAB *join_tab= tab->bush_children->start;
  4804. JOIN_TAB *save_return_tab= join->return_tab;
  4805. /*
  4806. Now run the join for the inner tables. The first call is to run the
  4807. join, the second one is to signal EOF (this is essential for some
  4808. join strategies, e.g. it will make join buffering flush the records)
  4809. */
  4810. if ((rc= sub_select(join, join_tab, FALSE/* no EOF */)) < 0 ||
  4811. (rc= sub_select(join, join_tab, TRUE/* now EOF */)) < 0)
  4812. {
  4813. join->return_tab= save_return_tab;
  4814. DBUG_RETURN(rc); /* it's NESTED_LOOP_(ERROR|KILLED)*/
  4815. }
  4816. join->return_tab= save_return_tab;
  4817. sjm->materialized= TRUE;
  4818. }
  4819. }
  4820. DBUG_RETURN(NESTED_LOOP_OK);
  4821. }
  4822. /*
  4823. Create a dummy temporary table, useful only for the sake of having a
  4824. TABLE* object with map,tablenr and maybe_null properties.
  4825. This is used by non-mergeable semi-join materialization code to handle
  4826. degenerate cases where materialized subquery produced "Impossible WHERE"
  4827. and thus wasn't materialized.
  4828. */
  4829. TABLE *create_dummy_tmp_table(THD *thd)
  4830. {
  4831. DBUG_ENTER("create_dummy_tmp_table");
  4832. TABLE *table;
  4833. TMP_TABLE_PARAM sjm_table_param;
  4834. sjm_table_param.init();
  4835. sjm_table_param.field_count= 1;
  4836. List<Item> sjm_table_cols;
  4837. const LEX_CSTRING dummy_name= { STRING_WITH_LEN("dummy") };
  4838. Item *column_item= new (thd->mem_root) Item_int(thd, 1);
  4839. if (!column_item)
  4840. DBUG_RETURN(NULL);
  4841. sjm_table_cols.push_back(column_item, thd->mem_root);
  4842. if (!(table= create_tmp_table(thd, &sjm_table_param,
  4843. sjm_table_cols, (ORDER*) 0,
  4844. TRUE /* distinct */,
  4845. 1, /*save_sum_fields*/
  4846. thd->variables.option_bits |
  4847. TMP_TABLE_ALL_COLUMNS,
  4848. HA_POS_ERROR /*rows_limit */,
  4849. &dummy_name, TRUE /* Do not open */)))
  4850. {
  4851. DBUG_RETURN(NULL);
  4852. }
  4853. DBUG_RETURN(table);
  4854. }
  4855. /*
  4856. A class that is used to catch one single tuple that is sent to the join
  4857. output, and save it in Item_cache element(s).
  4858. It is very similar to select_singlerow_subselect but doesn't require an
  4859. Item_singlerow_subselect item.
  4860. */
  4861. class select_value_catcher :public select_subselect
  4862. {
  4863. public:
  4864. select_value_catcher(THD *thd_arg, Item_subselect *item_arg):
  4865. select_subselect(thd_arg, item_arg)
  4866. {}
  4867. int send_data(List<Item> &items);
  4868. int setup(List<Item> *items);
  4869. bool assigned; /* TRUE <=> we've caught a value */
  4870. uint n_elements; /* How many elements we get */
  4871. Item_cache **row; /* Array of cache elements */
  4872. };
  4873. int select_value_catcher::setup(List<Item> *items)
  4874. {
  4875. assigned= FALSE;
  4876. n_elements= items->elements;
  4877. if (!(row= (Item_cache**) thd->alloc(sizeof(Item_cache*) * n_elements)))
  4878. return TRUE;
  4879. Item *sel_item;
  4880. List_iterator<Item> li(*items);
  4881. for (uint i= 0; (sel_item= li++); i++)
  4882. {
  4883. if (!(row[i]= sel_item->get_cache(thd)))
  4884. return TRUE;
  4885. row[i]->setup(thd, sel_item);
  4886. }
  4887. return FALSE;
  4888. }
  4889. int select_value_catcher::send_data(List<Item> &items)
  4890. {
  4891. DBUG_ENTER("select_value_catcher::send_data");
  4892. DBUG_ASSERT(!assigned);
  4893. DBUG_ASSERT(items.elements == n_elements);
  4894. if (unit->offset_limit_cnt)
  4895. { // Using limit offset,count
  4896. unit->offset_limit_cnt--;
  4897. DBUG_RETURN(0);
  4898. }
  4899. Item *val_item;
  4900. List_iterator_fast<Item> li(items);
  4901. for (uint i= 0; (val_item= li++); i++)
  4902. {
  4903. row[i]->store(val_item);
  4904. row[i]->cache_value();
  4905. }
  4906. assigned= TRUE;
  4907. DBUG_RETURN(0);
  4908. }
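/*
  Illustrative sketch only, not part of the server code. It mirrors the
  contract of select_value_catcher in a self-contained form: rows skipped
  by an OFFSET are ignored, and exactly one row is captured. ValueCatcher,
  its members and the use of long long values are hypothetical stand-ins
  for the Item_cache array used above.
*/

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for select_value_catcher: captures a single row.
class ValueCatcher {
public:
  // n_elements: row width; offset: rows to skip first (like offset_limit_cnt).
  explicit ValueCatcher(std::size_t n_elements, unsigned offset = 0)
      : row_(n_elements), offset_(offset) {}

  // Returns 0 on success, following the send_data() convention above.
  int send_data(const std::vector<long long> &items) {
    assert(!assigned_);                   // at most one row may be caught
    assert(items.size() == row_.size());  // row width must match setup
    if (offset_ > 0) {                    // still inside the OFFSET part
      --offset_;
      return 0;
    }
    row_ = items;                         // "cache" the values
    assigned_ = true;
    return 0;
  }

  bool assigned() const { return assigned_; }
  const std::vector<long long> &row() const { return row_; }

private:
  std::vector<long long> row_;
  unsigned offset_;
  bool assigned_ = false;
};
```

/*
  With an offset of 1 the first row sent is dropped and the second one is
  stored; assigned() then tells the caller whether any row was caught.
*/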
  4909. /**
  4910. @brief
  4911. Attach conditions to already optimized condition
  4912. @param thd the thread handle
  4913. @param cond the condition to which the new conditions are added
  4914. @param cond_eq IN/OUT the multiple equalities of cond
  4915. @param new_conds the list of conditions to be added
  4916. @param cond_value the returned value of the condition
  4917. if it can be evaluated
  4918. @details
  4919. The method creates a new condition by ANDing cond with
  4920. the conditions from the new_conds list.
  4921. The method is called after optimize_cond() for cond. The result
  4922. should be the same as if the conditions had been combined before
  4923. the optimize_cond() call.
  4924. @retval NULL if an error occurs
  4925. @retval otherwise the created condition
  4926. */
  4927. Item *and_new_conditions_to_optimized_cond(THD *thd, Item *cond,
  4928. COND_EQUAL **cond_eq,
  4929. List<Item> &new_conds,
  4930. Item::cond_result *cond_value)
  4931. {
  4932. COND_EQUAL new_cond_equal;
  4933. Item *item;
  4934. Item_equal *mult_eq;
  4935. bool is_simplified_cond= false;
  4936. /* The list where parts of the new condition are stored. */
  4937. List_iterator<Item> li(new_conds);
  4938. List_iterator_fast<Item_equal> it(new_cond_equal.current_level);
  4939. /*
  4940. Create multiple equalities from the equalities of the list new_conds.
  4941. Save the created multiple equalities in new_cond_equal.
  4942. If multiple equality can't be created or the condition
  4943. from new_conds list isn't an equality leave it in new_conds
  4944. list.
  4945. An equality can't be converted into a multiple equality if it
  4946. is known to be always false or always true,
  4947. for example the equality (3 = 1).
  4948. */
  4949. while ((item=li++))
  4950. {
  4951. if (item->type() == Item::FUNC_ITEM &&
  4952. ((Item_func *) item)->functype() == Item_func::EQ_FUNC &&
  4953. check_simple_equality(thd,
  4954. Item::Context(Item::ANY_SUBST,
  4955. ((Item_func_equal *)item)->compare_type_handler(),
  4956. ((Item_func_equal *)item)->compare_collation()),
  4957. ((Item_func *)item)->arguments()[0],
  4958. ((Item_func *)item)->arguments()[1],
  4959. &new_cond_equal))
  4960. li.remove();
  4961. }
  4962. it.rewind();
  4963. if (cond && cond->type() == Item::COND_ITEM &&
  4964. ((Item_cond*) cond)->functype() == Item_func::COND_AND_FUNC)
  4965. {
  4966. /*
  4967. Case when cond is an AND-condition.
  4968. Combine the AND-condition cond, the multiple equalities created from
  4969. new_cond_equal and the remaining conditions from new_conds.
  4970. */
  4971. COND_EQUAL *cond_equal= &((Item_cond_and *) cond)->m_cond_equal;
  4972. List<Item_equal> *cond_equalities= &cond_equal->current_level;
  4973. List<Item> *and_args= ((Item_cond_and *)cond)->argument_list();
  4974. /*
  4975. Disjoin multiple equalities of cond.
  4976. Merge these multiple equalities with the multiple equalities of
  4977. new_cond_equal. Save the result in new_cond_equal.
  4978. Check whether after the merge some multiple equalities are known
  4979. to be always true or always false.
  4980. */
  4981. and_args->disjoin((List<Item> *) cond_equalities);
  4982. while ((mult_eq= it++))
  4983. {
  4984. mult_eq->upper_levels= 0;
  4985. mult_eq->merge_into_list(thd, cond_equalities, false, false);
  4986. }
  4987. List_iterator_fast<Item_equal> ei(*cond_equalities);
  4988. while ((mult_eq= ei++))
  4989. {
  4990. if (mult_eq->const_item() && !mult_eq->val_int())
  4991. is_simplified_cond= true;
  4992. else
  4993. {
  4994. mult_eq->unfix_fields();
  4995. if (mult_eq->fix_fields(thd, NULL))
  4996. return NULL;
  4997. }
  4998. }
  4999. li.rewind();
  5000. while ((item=li++))
  5001. {
  5002. /*
  5003. The conditions of new_conds can still contain equalities below the
  5004. top level that have not been transformed into multiple equalities.
  5005. To transform them build_equal_items() is called.
  5006. Examples of not top level equalities:
  5007. 1. (t1.a = 3) OR (t1.b > 5)
  5008. (t1.a = 3) - not top level equality.
  5009. It is inside OR condition
  5010. 2. ((t3.d = t3.c) AND (t3.c < 15)) OR (t3.d > 1)
  5011. (t3.d = t3.c) - not top level equality.
  5012. It is inside AND condition which is a part of OR condition
  5013. */
  5014. if (item->type() == Item::COND_ITEM &&
  5015. ((Item_cond *)item)->functype() == Item_func::COND_OR_FUNC)
  5016. {
  5017. item= item->build_equal_items(thd,
  5018. &((Item_cond_and *) cond)->m_cond_equal,
  5019. false, NULL);
  5020. }
  5021. and_args->push_back(item, thd->mem_root);
  5022. }
  5023. and_args->append((List<Item> *) cond_equalities);
  5024. *cond_eq= &((Item_cond_and *) cond)->m_cond_equal;
  5025. }
  5026. else
  5027. {
  5028. /*
  5029. Case when cond isn't an AND-condition or is NULL.
  5030. There can be several cases:
  5031. 1. cond is a multiple equality.
  5032. In this case merge cond with the multiple equalities of
  5033. new_cond_equal.
  5034. Create new condition from the created multiple equalities
  5035. and new_conds list conditions.
  5036. 2. cond is NULL
  5037. Create new condition from new_conds list conditions
  5038. and multiple equalities from new_cond_equal.
  5039. 3. Otherwise
  5040. Create new condition through union of cond, conditions from new_conds
  5041. list and created multiple equalities from new_cond_equal.
  5042. */
  5043. List<Item> new_conds_list;
  5044. /* Flag is set to true if cond is a multiple equality */
  5045. bool is_mult_eq= (cond && cond->type() == Item::FUNC_ITEM &&
  5046. ((Item_func*) cond)->functype() == Item_func::MULT_EQUAL_FUNC);
  5047. /*
  5048. If cond is non-empty and is not multiple equality save it as
  5049. a part of a new condition.
  5050. */
  5051. if (cond && !is_mult_eq &&
  5052. new_conds_list.push_back(cond, thd->mem_root))
  5053. return NULL;
  5054. /*
  5055. If cond is a multiple equality merge it with new_cond_equal
  5056. multiple equalities.
  5057. */
  5058. if (is_mult_eq)
  5059. {
  5060. Item_equal *eq_cond= (Item_equal *)cond;
  5061. eq_cond->upper_levels= 0;
  5062. eq_cond->merge_into_list(thd, &new_cond_equal.current_level,
  5063. false, false);
  5064. }
  5065. /**
  5066. Fix the created multiple equalities and check whether they are known
  5067. to be always true or always false.
  5068. */
  5069. List_iterator_fast<Item_equal> ei(new_cond_equal.current_level);
  5070. while ((mult_eq=ei++))
  5071. {
  5072. if (mult_eq->const_item() && !mult_eq->val_int())
  5073. is_simplified_cond= true;
  5074. else
  5075. {
  5076. mult_eq->unfix_fields();
  5077. if (mult_eq->fix_fields(thd, NULL))
  5078. return NULL;
  5079. }
  5080. }
  5081. /*
  5082. Create AND condition if new condition will have two or
  5083. more elements.
  5084. */
  5085. Item_cond_and *and_cond= 0;
  5086. COND_EQUAL *inherited= 0;
  5087. if (new_conds_list.elements +
  5088. new_conds.elements +
  5089. new_cond_equal.current_level.elements > 1)
  5090. {
  5091. and_cond= new (thd->mem_root) Item_cond_and(thd);
  5092. and_cond->m_cond_equal.copy(new_cond_equal);
  5093. inherited= &and_cond->m_cond_equal;
  5094. }
  5095. li.rewind();
  5096. while ((item=li++))
  5097. {
  5098. /*
  5099. Look for the comment in the case when cond is an
  5100. AND condition above the build_equal_items() call.
  5101. */
  5102. if (item->type() == Item::COND_ITEM &&
  5103. ((Item_cond *)item)->functype() == Item_func::COND_OR_FUNC)
  5104. {
  5105. item= item->build_equal_items(thd, inherited, false, NULL);
  5106. }
  5107. new_conds_list.push_back(item, thd->mem_root);
  5108. }
  5109. new_conds_list.append((List<Item> *)&new_cond_equal.current_level);
  5110. if (and_cond)
  5111. {
  5112. and_cond->argument_list()->append(&new_conds_list);
  5113. cond= (Item *)and_cond;
  5114. *cond_eq= &((Item_cond_and *) cond)->m_cond_equal;
  5115. }
  5116. else
  5117. {
  5118. List_iterator_fast<Item> iter(new_conds_list);
  5119. cond= iter++;
  5120. if (cond->type() == Item::FUNC_ITEM &&
  5121. ((Item_func *)cond)->functype() == Item_func::MULT_EQUAL_FUNC)
  5122. {
  5123. if (!(*cond_eq))
  5124. *cond_eq= new COND_EQUAL();
  5125. (*cond_eq)->copy(new_cond_equal);
  5126. }
  5127. else
  5128. *cond_eq= 0;
  5129. }
  5130. }
  5131. if (!cond)
  5132. return NULL;
  5133. if (*cond_eq)
  5134. {
  5135. /*
  5136. The multiple equalities are attached only to the upper level
  5137. of AND-condition cond.
  5138. Push them down to the bottom levels of cond AND-condition if needed.
  5139. */
  5140. propagate_new_equalities(thd, cond,
  5141. &(*cond_eq)->current_level,
  5142. 0,
  5143. &is_simplified_cond);
  5144. cond= cond->propagate_equal_fields(thd,
  5145. Item::Context_boolean(),
  5146. *cond_eq);
  5147. cond->update_used_tables();
  5148. }
  5149. /* Check if cond has parts that are known to be always true or false. */
  5150. if (cond &&
  5151. !is_simplified_cond &&
  5152. cond->walk(&Item::is_simplified_cond_processor, 0, 0))
  5153. is_simplified_cond= true;
  5154. /*
  5155. If some equalities were found to be always true or always false,
  5156. remove them from cond and set cond_value to the appropriate value.
  5157. */
  5158. if (cond && is_simplified_cond)
  5159. cond= cond->remove_eq_conds(thd, cond_value, true);
  5160. if (cond && cond->fix_fields_if_needed(thd, NULL))
  5161. return NULL;
  5162. return cond;
  5163. }
  5164. /**
  5165. @brief Materialize a degenerate jtbm semi join
  5166. @param thd thread handler
  5167. @param tbl table list for the target jtbm semi join table
  5168. @param subq_pred IN subquery predicate with the degenerate jtbm semi join
  5169. @param eq_list IN/OUT the list where to add produced equalities
  5170. @details
  5171. The method materializes the degenerate jtbm semi join for the
  5172. subquery from the IN subquery predicate subq_pred taking tbl
  5173. as the target for materialization.
  5174. Any degenerate table is guaranteed to produce 0 or 1 record.
  5175. Examples of both cases:
  5176. select * from ot where col in (select ... from it where 2>3)
  5177. select * from ot where col in (select min(it.key) from it)
  5178. In this case, there is no need to create a temp. table for
  5179. materialization.
  5180. We now just need to
  5181. 1. Check whether 1 or 0 records are produced, setup this as a
  5182. constant join tab.
  5183. 2. Create a dummy temporary table, because all of the join
  5184. optimization code relies on TABLE object being present.
  5185. In the case when materialization produces one row the function
  5186. additionally creates equalities between the expressions from the
  5187. left part of the IN subquery predicate and the corresponding
  5188. columns of the produced row. These equalities are added to the
  5189. list eq_list. They are supposed to be ANDed with the condition
  5190. of the WHERE clause.
  5191. @retval TRUE if an error occurs
  5192. @retval FALSE otherwise
  5193. */
  5194. bool execute_degenerate_jtbm_semi_join(THD *thd,
  5195. TABLE_LIST *tbl,
  5196. Item_in_subselect *subq_pred,
  5197. List<Item> &eq_list)
  5198. {
  5199. DBUG_ENTER("execute_degenerate_jtbm_semi_join");
  5200. select_value_catcher *new_sink;
  5201. DBUG_ASSERT(subq_pred->engine->engine_type() ==
  5202. subselect_engine::SINGLE_SELECT_ENGINE);
  5203. subselect_single_select_engine *engine=
  5204. (subselect_single_select_engine*)subq_pred->engine;
  5205. if (!(new_sink= new (thd->mem_root) select_value_catcher(thd, subq_pred)))
  5206. DBUG_RETURN(TRUE);
  5207. if (new_sink->setup(&engine->select_lex->join->fields_list) ||
  5208. engine->select_lex->join->change_result(new_sink, NULL) ||
  5209. engine->exec())
  5210. {
  5211. DBUG_RETURN(TRUE);
  5212. }
  5213. subq_pred->is_jtbm_const_tab= TRUE;
  5214. if (new_sink->assigned)
  5215. {
  5216. /*
  5217. Subselect produced one row, which is saved in new_sink->row.
  5218. Save "left_expr[i] == row[i]" equalities into the eq_list.
  5219. */
  5220. subq_pred->jtbm_const_row_found= TRUE;
  5221. Item *eq_cond;
  5222. for (uint i= 0; i < subq_pred->left_expr->cols(); i++)
  5223. {
  5224. eq_cond=
  5225. new (thd->mem_root) Item_func_eq(thd,
  5226. subq_pred->left_expr->element_index(i),
  5227. new_sink->row[i]);
  5228. if (!eq_cond || eq_cond->fix_fields(thd, NULL) ||
  5229. eq_list.push_back(eq_cond, thd->mem_root))
  5230. DBUG_RETURN(TRUE);
  5231. }
  5232. }
  5233. else
  5234. {
  5235. /* Subselect produced no rows. Just set the flag */
  5236. subq_pred->jtbm_const_row_found= FALSE;
  5237. }
  5238. TABLE *dummy_table;
  5239. if (!(dummy_table= create_dummy_tmp_table(thd)))
  5240. DBUG_RETURN(TRUE);
  5241. tbl->table= dummy_table;
  5242. tbl->table->pos_in_table_list= tbl;
  5243. /*
  5244. Note: the table created above may be freed by:
  5245. 1. JOIN_TAB::cleanup(), when the parent join is a regular join.
  5246. 2. cleanup_empty_jtbm_semi_joins(), when the parent join is a
  5247. degenerate join (e.g. one with "Impossible where").
  5248. */
  5249. setup_table_map(tbl->table, tbl, tbl->jtbm_table_no);
  5250. DBUG_RETURN(FALSE);
  5251. }
  5252. /**
  5253. @brief
  5254. Execute degenerate jtbm semi joins before optimize_cond() for parent
  5255. @param join the parent join for jtbm semi joins
  5256. @param join_list the list of tables where jtbm semi joins are processed
  5257. @param eq_list IN/OUT the list where to add equalities produced after
  5258. materialization of single-row degenerate jtbm semi joins
  5259. @details
  5260. The method traverses join_list trying to find any degenerate jtbm semi
  5261. joins for subqueries of IN predicates. For each degenerate jtbm
  5262. semi join execute_degenerate_jtbm_semi_join() is called. As a result
  5263. of this call new equalities that substitute for single-row materialized
  5264. jtbm semi join are added to eq_list.
  5265. In the case when a table is nested in another table 'nested_join' the
  5266. method is recursively called for the join_list of the 'nested_join' trying
  5267. to find in the list any degenerate jtbm semi joins. Currently a jtbm semi
  5268. join may occur in a mergeable semi join nest.
  5269. @retval TRUE if an error occurs
  5270. @retval FALSE otherwise
  5271. */
  5272. bool setup_degenerate_jtbm_semi_joins(JOIN *join,
  5273. List<TABLE_LIST> *join_list,
  5274. List<Item> &eq_list)
  5275. {
  5276. TABLE_LIST *table;
  5277. NESTED_JOIN *nested_join;
  5278. List_iterator<TABLE_LIST> li(*join_list);
  5279. THD *thd= join->thd;
  5280. DBUG_ENTER("setup_degenerate_jtbm_semi_joins");
  5281. while ((table= li++))
  5282. {
  5283. Item_in_subselect *subq_pred;
  5284. if ((subq_pred= table->jtbm_subselect))
  5285. {
  5286. JOIN *subq_join= subq_pred->unit->first_select()->join;
  5287. if (!subq_join->tables_list || !subq_join->table_count)
  5288. {
  5289. if (execute_degenerate_jtbm_semi_join(thd,
  5290. table,
  5291. subq_pred,
  5292. eq_list))
  5293. DBUG_RETURN(TRUE);
  5294. join->is_orig_degenerated= true;
  5295. }
  5296. }
  5297. if ((nested_join= table->nested_join))
  5298. {
  5299. if (setup_degenerate_jtbm_semi_joins(join,
  5300. &nested_join->join_list,
  5301. eq_list))
  5302. DBUG_RETURN(TRUE);
  5303. }
  5304. }
  5305. DBUG_RETURN(FALSE);
  5306. }
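/*
  Illustrative sketch only, not part of the server code. It shows the
  traversal pattern used by setup_degenerate_jtbm_semi_joins(): walk a join
  list, handle every jtbm entry whose subquery has no tables, and recurse
  into nested join lists. TableEntry and collect_degenerate_jtbm() are
  hypothetical simplifications of TABLE_LIST and NESTED_JOIN.
*/

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for TABLE_LIST.
struct TableEntry {
  std::string name;
  bool is_jtbm = false;            // stands in for table->jtbm_subselect
  unsigned inner_table_count = 0;  // stands in for subq_join->table_count
  std::vector<TableEntry> nested;  // stands in for nested_join->join_list
};

// Collect the names of degenerate jtbm entries, depth first, in list order.
void collect_degenerate_jtbm(const std::vector<TableEntry> &join_list,
                             std::vector<std::string> &out) {
  for (const TableEntry &t : join_list) {
    // A jtbm subquery with no tables is degenerate: 0 or 1 rows.
    if (t.is_jtbm && t.inner_table_count == 0)
      out.push_back(t.name);
    // Nested joins carry their own join list; process it recursively.
    if (!t.nested.empty())
      collect_degenerate_jtbm(t.nested, out);
  }
}
```

/*
  The real function executes each degenerate subquery instead of recording
  its name, and stops at the first error; the recursion shape is the same.
*/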
  5307. /**
  5308. @brief
  5309. Optimize jtbm semi joins for materialization
  5310. @param join the parent join for jtbm semi joins
  5311. @param join_list the list of TABLE_LIST objects where jtbm semi join
  5312. can occur
  5313. @param eq_list IN/OUT the list where to add produced equalities
  5314. @details
  5315. This method is called by the optimizer after the call of
  5316. optimize_cond() for parent select.
  5317. The method traverses join_list trying to find any jtbm semi joins for
  5318. subqueries from IN predicates and optimizes them.
  5319. After the optimization some of jtbm semi joins may become degenerate.
  5320. For example the subquery 'SELECT MAX(b) FROM t2' from the query
  5321. SELECT * FROM t1 WHERE 4 IN (SELECT MAX(b) FROM t2);
  5322. will become degenerate if there is an index on t2.b.
  5323. If a subquery becomes degenerate it is handled by the function
  5324. execute_degenerate_jtbm_semi_join().
  5325. Otherwise the method creates a temporary table in which the subquery
  5326. of the jtbm semi join will be materialized.
  5327. The function saves the equalities between all pairs of the expressions
  5328. from the left part of the IN subquery predicate and the corresponding
  5329. columns of the subquery from the predicate in eq_list appending them
  5330. to the list. The equalities of eq_list will be later ANDed with the
  5331. condition of the WHERE clause.
  5332. In the case when a table is nested in another table 'nested_join' the
  5333. method is recursively called for the join_list of the 'nested_join' trying
  5334. to find in the list any degenerate jtbm semi joins. Currently a jtbm semi
  5335. join may occur in a mergeable semi join nest.
  5336. @retval TRUE if an error occurs
  5337. @retval FALSE otherwise
  5338. */
  5339. bool setup_jtbm_semi_joins(JOIN *join, List<TABLE_LIST> *join_list,
  5340. List<Item> &eq_list)
  5341. {
  5342. TABLE_LIST *table;
  5343. NESTED_JOIN *nested_join;
  5344. List_iterator<TABLE_LIST> li(*join_list);
  5345. THD *thd= join->thd;
  5346. DBUG_ENTER("setup_jtbm_semi_joins");
  5347. while ((table= li++))
  5348. {
  5349. Item_in_subselect *subq_pred;
  5350. if ((subq_pred= table->jtbm_subselect))
  5351. {
  5352. double rows;
  5353. double read_time;
  5354. /*
  5355. Perform optimization of the subquery, so that we know estimated
  5356. - cost of materialization process
  5357. - how many records will be in the materialized temp.table
  5358. */
  5359. if (subq_pred->optimize(&rows, &read_time))
  5360. DBUG_RETURN(TRUE);
  5361. subq_pred->jtbm_read_time= read_time;
  5362. subq_pred->jtbm_record_count=rows;
  5363. JOIN *subq_join= subq_pred->unit->first_select()->join;
  5364. if (!subq_join->tables_list || !subq_join->table_count)
  5365. {
  5366. if (!join->is_orig_degenerated &&
  5367. execute_degenerate_jtbm_semi_join(thd, table, subq_pred,
  5368. eq_list))
  5369. DBUG_RETURN(TRUE);
  5370. }
  5371. else
  5372. {
  5373. DBUG_ASSERT(subq_pred->test_set_strategy(SUBS_MATERIALIZATION));
  5374. subq_pred->is_jtbm_const_tab= FALSE;
  5375. subselect_hash_sj_engine *hash_sj_engine=
  5376. ((subselect_hash_sj_engine*)subq_pred->engine);
  5377. table->table= hash_sj_engine->tmp_table;
  5378. table->table->pos_in_table_list= table;
  5379. setup_table_map(table->table, table, table->jtbm_table_no);
  5380. List_iterator<Item> li(*hash_sj_engine->semi_join_conds->argument_list());
  5381. Item *item;
  5382. while ((item=li++))
  5383. {
  5384. item->update_used_tables();
  5385. if (eq_list.push_back(item, thd->mem_root))
  5386. DBUG_RETURN(TRUE);
  5387. }
  5388. }
  5389. table->table->maybe_null= MY_TEST(join->mixed_implicit_grouping);
  5390. }
  5391. if ((nested_join= table->nested_join))
  5392. {
  5393. if (setup_jtbm_semi_joins(join, &nested_join->join_list, eq_list))
  5394. DBUG_RETURN(TRUE);
  5395. }
  5396. }
  5397. DBUG_RETURN(FALSE);
  5398. }
  5399. /*
  5400. Clean up non-merged semi-joins (JTBMs) that are empty.
  5401. This function performs cleanup for a special case:
  5402. Consider a query like
  5403. select * from t1 where 1=2 AND t1.col IN (select max(..) ... having 1=2)
  5404. For this query, optimization of subquery will short-circuit, and
  5405. setup_jtbm_semi_joins() will call create_dummy_tmp_table() so that we have
  5406. an empty, constant temp. table to stand in for the materialized temp. table.
  5407. Now, suppose that the upper join is also found to be degenerate. In that
  5408. case, no JOIN_TAB array will be produced, and hence, JOIN::cleanup() will
  5409. have a problem with cleaning up empty JTBMs (non-empty ones are cleaned up
  5410. through Item::cleanup() calls).
  5411. */
  5412. void cleanup_empty_jtbm_semi_joins(JOIN *join, List<TABLE_LIST> *join_list)
  5413. {
  5414. List_iterator<TABLE_LIST> li(*join_list);
  5415. TABLE_LIST *table;
  5416. while ((table= li++))
  5417. {
  5418. if ((table->jtbm_subselect && table->jtbm_subselect->is_jtbm_const_tab))
  5419. {
  5420. if (table->table)
  5421. {
  5422. free_tmp_table(join->thd, table->table);
  5423. table->table= NULL;
  5424. }
  5425. }
  5426. else if (table->nested_join && table->sj_subq_pred)
  5427. {
  5428. cleanup_empty_jtbm_semi_joins(join, &table->nested_join->join_list);
  5429. }
  5430. }
  5431. }
  5432. /**
  5433. Choose an optimal strategy to execute an IN/ALL/ANY subquery predicate
  5434. based on cost.
  5435. @param join_tables the set of tables joined in the subquery
  5436. @notes
  5437. The method chooses between the materialization and IN=>EXISTS rewrite
  5438. strategies for the execution of a non-flattened subquery IN predicate.
  5439. The cost-based decision is made as follows:
  5440. 1. compute materialize_strategy_cost based on the unmodified subquery
  5441. 2. reoptimize the subquery taking into account the IN-EXISTS predicates
  5442. 3. compute in_exists_strategy_cost based on the reoptimized plan
  5443. 4. compare and set the cheaper strategy
  5444. if (materialize_strategy_cost >= in_exists_strategy_cost)
  5445. in_strategy = IN_TO_EXISTS
  5446. else
  5447. in_strategy = MATERIALIZATION
  5448. 5. if in_strategy = MATERIALIZATION and it is not possible to initialize it
  5449. revert to IN_TO_EXISTS
  5450. 6. if (in_strategy == MATERIALIZATION)
  5451. revert the subquery plan to the original one before reoptimizing
  5452. else
  5453. inject the IN=>EXISTS predicates into the new EXISTS subquery plan
  5454. The implementation itself is a bit more complicated because it takes into
  5455. account two more factors:
  5456. - whether the user allowed both strategies through an optimizer_switch, and
  5457. - if materialization was the cheaper strategy, whether it can be executed
  5458. or not.
  5459. @retval FALSE success.
  5460. @retval TRUE error occurred.
  5461. */
  5462. bool JOIN::choose_subquery_plan(table_map join_tables)
  5463. {
  5464. enum_reopt_result reopt_result= REOPT_NONE;
  5465. Item_in_subselect *in_subs;
  5466. /*
  5467. IN/ALL/ANY optimizations are not applicable for so called fake select
  5468. (this select exists only to filter results of union if it is needed).
  5469. */
  5470. if (select_lex == select_lex->master_unit()->fake_select_lex)
  5471. return 0;
  5472. if (is_in_subquery())
  5473. {
  5474. in_subs= (Item_in_subselect*) unit->item;
  5475. if (in_subs->create_in_to_exists_cond(this))
  5476. return true;
  5477. }
  5478. else
  5479. return false;
  5480. /* A strategy must be chosen earlier. */
  5481. DBUG_ASSERT(in_subs->has_strategy());
  5482. DBUG_ASSERT(in_to_exists_where || in_to_exists_having);
  5483. DBUG_ASSERT(!in_to_exists_where || in_to_exists_where->is_fixed());
  5484. DBUG_ASSERT(!in_to_exists_having || in_to_exists_having->is_fixed());
  5485. /* The original QEP of the subquery. */
  5486. Join_plan_state save_qep(table_count);
  5487. /*
  5488. Compute and compare the costs of materialization and in-exists if both
  5489. strategies are possible and allowed by the user (checked during the prepare
  5490. phase).
  5491. */
  5492. if (in_subs->test_strategy(SUBS_MATERIALIZATION) &&
  5493. in_subs->test_strategy(SUBS_IN_TO_EXISTS))
  5494. {
  5495. JOIN *outer_join;
  5496. JOIN *inner_join= this;
  5497. /* Number of unique value combinations filtered by the IN predicate. */
  5498. double outer_lookup_keys;
  5499. /* Cost and row count of the unmodified subquery. */
  5500. double inner_read_time_1, inner_record_count_1;
  5501. /* Cost of the subquery with injected IN-EXISTS predicates. */
  5502. double inner_read_time_2;
  5503. /* The cost to compute IN via materialization. */
  5504. double materialize_strategy_cost;
  5505. /* The cost of the IN->EXISTS strategy. */
  5506. double in_exists_strategy_cost;
  5507. double dummy;
  5508. /*
  5509. A. Estimate the number of rows of the outer table that will be filtered
  5510. by the IN predicate.
  5511. */
  5512. outer_join= unit->outer_select() ? unit->outer_select()->join : NULL;
  5513. /*
  5514. Get the cost of the outer join if:
  5515. (1) It has at least one table, and
  5516. (2) It has been already optimized (if there is no join_tab, then the
  5517. outer join has not been optimized yet).
  5518. */
  5519. if (outer_join && outer_join->table_count > 0 && // (1)
  5520. outer_join->join_tab && // (2)
  5521. !in_subs->const_item())
  5522. {
  5523. /*
  5524. TODO:
  5525. Currently outer_lookup_keys is computed as the number of rows in
  5526. the partial join including the JOIN_TAB where the IN predicate is
  5527. pushed to. In the general case this is a gross overestimate because
  5528. due to caching we are interested only in the number of unique keys.
  5529. The search key may be formed by columns from much fewer than all
  5530. tables in the partial join. Example:
  5531. select * from t1, t2 where t1.c1 = t2.key AND t2.c2 IN (select ...);
  5532. If the join order is t1, t2, the number of unique lookup keys is about
  5533. the number of unique values of t2.c2 in the partial join t1 join t2.
  5534. */
  5535. outer_join->get_partial_cost_and_fanout(in_subs->get_join_tab_idx(),
  5536. table_map(-1),
  5537. &dummy,
  5538. &outer_lookup_keys);
  5539. }
  5540. else
  5541. {
  5542. /*
  5543. TODO: outer_join can be NULL for DELETE statements.
  5544. How to compute its cost?
  5545. */
  5546. outer_lookup_keys= 1;
  5547. }
  5548. /*
  5549. B. Estimate the cost and number of records of the subquery both
  5550. unmodified, and with injected IN->EXISTS predicates.
  5551. */
  5552. inner_read_time_1= inner_join->best_read;
  5553. inner_record_count_1= inner_join->join_record_count;
  5554. if (in_to_exists_where && const_tables != table_count)
  5555. {
  5556. /*
  5557. Re-optimize and cost the subquery taking into account the IN-EXISTS
  5558. conditions.
  5559. */
  5560. reopt_result= reoptimize(in_to_exists_where, join_tables, &save_qep);
  5561. if (reopt_result == REOPT_ERROR)
  5562. return TRUE;
  5563. /* Get the cost of the modified IN-EXISTS plan. */
  5564. inner_read_time_2= inner_join->best_read;
  5565. }
  5566. else
  5567. {
  5568. /* Reoptimization would not produce any better plan. */
  5569. inner_read_time_2= inner_read_time_1;
  5570. }
  5571. /*
  5572. C. Compute execution costs.
  5573. */
  5574. /* C.1 Compute the cost of the materialization strategy. */
  5575. //uint rowlen= get_tmp_table_rec_length(unit->first_select()->item_list);
  5576. uint rowlen= get_tmp_table_rec_length(ref_ptrs,
  5577. select_lex->item_list.elements);
  5578. /* The cost of writing one row into the temporary table. */
  5579. double write_cost= get_tmp_table_write_cost(thd, inner_record_count_1,
  5580. rowlen);
  5581. /* The cost of a lookup into the unique index of the materialized table. */
  5582. double lookup_cost= get_tmp_table_lookup_cost(thd, inner_record_count_1,
  5583. rowlen);
  5584. /*
  5585. The cost of executing the subquery and storing its result in an indexed
  5586. temporary table.
  5587. */
  5588. double materialization_cost= COST_ADD(inner_read_time_1,
  5589. COST_MULT(write_cost,
  5590. inner_record_count_1));
  5591. materialize_strategy_cost= COST_ADD(materialization_cost,
  5592. COST_MULT(outer_lookup_keys,
  5593. lookup_cost));
  5594. /* C.2 Compute the cost of the IN=>EXISTS strategy. */
  5595. in_exists_strategy_cost= COST_MULT(outer_lookup_keys, inner_read_time_2);
  5596. /* C.3 Compare the costs and choose the cheaper strategy. */
  5597. if (materialize_strategy_cost >= in_exists_strategy_cost)
  5598. in_subs->set_strategy(SUBS_IN_TO_EXISTS);
  5599. else
  5600. in_subs->set_strategy(SUBS_MATERIALIZATION);
  5601. DBUG_PRINT("info",
  5602. ("mat_strategy_cost: %.2f, mat_cost: %.2f, write_cost: %.2f, lookup_cost: %.2f",
  5603. materialize_strategy_cost, materialization_cost, write_cost, lookup_cost));
  5604. DBUG_PRINT("info",
  5605. ("inx_strategy_cost: %.2f, inner_read_time_2: %.2f",
  5606. in_exists_strategy_cost, inner_read_time_2));
  5607. DBUG_PRINT("info",("outer_lookup_keys: %.2f", outer_lookup_keys));
  5608. }
  5609. /*
  5610. If (1) materialization is a possible strategy based on semantic analysis
  5611. during the prepare phase, then if
  5612. (2) it is more expensive than the IN->EXISTS transformation, and
  5613. (3) it is not possible to create usable indexes for the materialization
  5614. strategy,
  5615. fall back to IN->EXISTS.
  5616. otherwise
  5617. use materialization.
  5618. */
  5619. if (in_subs->test_strategy(SUBS_MATERIALIZATION) &&
  5620. in_subs->setup_mat_engine())
  5621. {
  5622. /*
  5623. If materialization was the cheaper or the only user-selected strategy,
  5624. but it is not possible to execute it due to limitations in the
  5625. implementation, fall back to IN-TO-EXISTS.
  5626. */
  5627. in_subs->set_strategy(SUBS_IN_TO_EXISTS);
  5628. }
  5629. if (in_subs->test_strategy(SUBS_MATERIALIZATION))
  5630. {
  5631. /* Restore the original query plan used for materialization. */
  5632. if (reopt_result == REOPT_NEW_PLAN)
  5633. restore_query_plan(&save_qep);
  5634. in_subs->unit->uncacheable&= ~UNCACHEABLE_DEPENDENT_INJECTED;
  5635. select_lex->uncacheable&= ~UNCACHEABLE_DEPENDENT_INJECTED;
  5636. /*
  5637. Reset the "LIMIT 1" set in Item_exists_subselect::fix_length_and_dec.
  5638. TODO:
  5639. Currently we set the subquery LIMIT to infinity, and this is correct
  5640. because we forbid at parse time LIMIT inside IN subqueries (see
  5641. Item_in_subselect::test_limit). However, once we allow this, here
  5642. we should set the correct limit if given in the query.
  5643. */
  5644. in_subs->unit->global_parameters()->select_limit= NULL;
  5645. in_subs->unit->set_limit(unit->global_parameters());
  5646. /*
  5647. Set the limit of this JOIN object as well, because normally it is
  5648. set in the beginning of JOIN::optimize, which was already done.
  5649. */
  5650. select_limit= in_subs->unit->select_limit_cnt;
  5651. }
  5652. else if (in_subs->test_strategy(SUBS_IN_TO_EXISTS))
  5653. {
  5654. if (reopt_result == REOPT_NONE && in_to_exists_where &&
  5655. const_tables != table_count)
  5656. {
  5657. /*
  5658. The subquery was not reoptimized with the newly injected IN-EXISTS
  5659. conditions either because the user allowed only the IN-EXISTS strategy,
  5660. or because materialization was not possible based on semantic analysis.
  5661. */
  5662. reopt_result= reoptimize(in_to_exists_where, join_tables, NULL);
  5663. if (reopt_result == REOPT_ERROR)
  5664. return TRUE;
  5665. }
  5666. if (in_subs->inject_in_to_exists_cond(this))
  5667. return TRUE;
  5668. /*
  5669. If the injected predicate is correlated, the IN->EXISTS transformation
  5670. makes the subquery dependent.
  5671. */
  5672. if ((in_to_exists_where &&
  5673. in_to_exists_where->used_tables() & OUTER_REF_TABLE_BIT) ||
  5674. (in_to_exists_having &&
  5675. in_to_exists_having->used_tables() & OUTER_REF_TABLE_BIT))
  5676. {
  5677. in_subs->unit->uncacheable|= UNCACHEABLE_DEPENDENT_INJECTED;
  5678. select_lex->uncacheable|= UNCACHEABLE_DEPENDENT_INJECTED;
  5679. }
  5680. select_limit= 1;
  5681. }
  5682. else
  5683. DBUG_ASSERT(FALSE);
  5684. return FALSE;
  5685. }
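/*
  Illustration only, not part of the server code. A minimal sketch of the
  decision described above: prefer materialization when it is allowed and not
  more expensive, fall back to IN->EXISTS when the materialization engine
  cannot be set up, and stop after the first row only for IN->EXISTS.
  Strategy, Choice and choose_strategy are hypothetical names, and breaking
  cost ties in favour of materialization is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-ins for the server's strategy flags.
enum Strategy { IN_TO_EXISTS, MATERIALIZATION };

struct Choice
{
  Strategy strategy;
  bool limit_one;  // true when the subquery may stop after the first row
};

// mat_allowed:   materialization survived semantic analysis.
// mat_engine_ok: the materialization engine could be set up.
Choice choose_strategy(bool mat_allowed, double mat_cost,
                       double in_exists_cost, bool mat_engine_ok)
{
  Strategy s= IN_TO_EXISTS;
  if (mat_allowed && mat_cost <= in_exists_cost)
    s= MATERIALIZATION;
  // Fall back when materialization cannot actually be executed.
  if (s == MATERIALIZATION && !mat_engine_ok)
    s= IN_TO_EXISTS;
  return Choice{s, s == IN_TO_EXISTS};
}
```

  For example, choose_strategy(true, 10.0, 20.0, false) yields IN_TO_EXISTS
  with limit_one set, mirroring the "select_limit= 1" assignment above.
*/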
  5686. /**
  5687. Choose a query plan for a table-less subquery.
  5688. @notes
  5689. @retval FALSE success.
  5690. @retval TRUE error occurred.
  5691. */
  5692. bool JOIN::choose_tableless_subquery_plan()
  5693. {
  5694. DBUG_ASSERT(!tables_list || !table_count);
  5695. if (unit->item)
  5696. {
  5697. DBUG_ASSERT(unit->item->type() == Item::SUBSELECT_ITEM);
  5698. Item_subselect *subs_predicate= unit->item;
  5699. /*
  5700. If the optimizer determined that this query has an empty result,
  5701. in most cases the subquery predicate is a known constant value -
  5702. one of TRUE, FALSE or NULL. The implementation of
  5703. Item_subselect::no_rows_in_result() determines which one.
  5704. */
  5705. if (zero_result_cause)
  5706. {
  5707. if (!implicit_grouping)
  5708. {
  5709. /*
  5710. Both group by queries and non-group by queries without aggregate
  5711. functions produce an empty subquery result. There is no need to further
  5712. rewrite the subquery because it will not be executed at all.
  5713. */
  5714. exec_const_cond= 0;
  5715. return FALSE;
  5716. }
  5717. /* @todo
  5718. A further optimization is possible when a non-group query with
  5719. MIN/MAX/COUNT is optimized by opt_sum_query. Then, if there are
  5720. only MIN/MAX functions over an empty result set, the subquery
  5721. result is a NULL value/row, thus the value of subs_predicate is
  5722. NULL.
  5723. */
  5724. }
  5725. /*
  5726. For IN subqueries, use IN->EXISTS transformation, unless the subquery
  5727. has been converted to a JTBM semi-join. In that case, just leave
  5728. everything as-is, setup_jtbm_semi_joins() has special handling for cases
  5729. like this.
  5730. */
  5731. if (subs_predicate->is_in_predicate() &&
  5732. !(subs_predicate->substype() == Item_subselect::IN_SUBS &&
  5733. ((Item_in_subselect*)subs_predicate)->is_jtbm_merged))
  5734. {
  5735. Item_in_subselect *in_subs;
  5736. in_subs= (Item_in_subselect*) subs_predicate;
  5737. in_subs->set_strategy(SUBS_IN_TO_EXISTS);
  5738. if (in_subs->create_in_to_exists_cond(this) ||
  5739. in_subs->inject_in_to_exists_cond(this))
  5740. return TRUE;
  5741. tmp_having= having;
  5742. }
  5743. }
  5744. exec_const_cond= zero_result_cause ? 0 : conds;
  5745. return FALSE;
  5746. }
  5747. bool Item::pushable_equality_checker_for_subquery(uchar *arg)
  5748. {
  5749. return
  5750. get_corresponding_field_pair(this,
  5751. ((Item_in_subselect *)arg)->corresponding_fields);
  5752. }
  5753. /*
  5754. Checks if 'item' or some item equal to it is equal to the field from
  5755. some Field_pair of 'pair_list' and returns matching Field_pair or
  5756. NULL if the matching Field_pair wasn't found.
  5757. */
  5758. Field_pair *find_matching_field_pair(Item *item, List<Field_pair> pair_list)
  5759. {
  5760. Field_pair *field_pair= get_corresponding_field_pair(item, pair_list);
  5761. if (field_pair)
  5762. return field_pair;
  5763. Item_equal *item_equal= item->get_item_equal();
  5764. if (item_equal)
  5765. {
  5766. Item_equal_fields_iterator it(*item_equal);
  5767. Item *equal_item;
  5768. while ((equal_item= it++))
  5769. {
  5770. if (equal_item->const_item())
  5771. continue;
  5772. field_pair= get_corresponding_field_pair(equal_item, pair_list);
  5773. if (field_pair)
  5774. return field_pair;
  5775. }
  5776. }
  5777. return NULL;
  5778. }
  5779. bool Item_field::excl_dep_on_in_subq_left_part(Item_in_subselect *subq_pred)
  5780. {
  5781. if (find_matching_field_pair(((Item *) this), subq_pred->corresponding_fields))
  5782. return true;
  5783. return false;
  5784. }
  5785. bool Item_direct_view_ref::excl_dep_on_in_subq_left_part(Item_in_subselect *subq_pred)
  5786. {
  5787. if (item_equal)
  5788. {
  5789. DBUG_ASSERT(real_item()->type() == Item::FIELD_ITEM);
  5790. if (get_corresponding_field_pair(((Item *)this), subq_pred->corresponding_fields))
  5791. return true;
  5792. }
  5793. return (*ref)->excl_dep_on_in_subq_left_part(subq_pred);
  5794. }
  5795. bool Item_equal::excl_dep_on_in_subq_left_part(Item_in_subselect *subq_pred)
  5796. {
  5797. Item *left_item = get_const();
  5798. Item_equal_fields_iterator it(*this);
  5799. Item *item;
  5800. if (!left_item)
  5801. {
  5802. while ((item=it++))
  5803. {
  5804. if (item->excl_dep_on_in_subq_left_part(subq_pred))
  5805. {
  5806. left_item= item;
  5807. break;
  5808. }
  5809. }
  5810. }
  5811. if (!left_item)
  5812. return false;
  5813. while ((item=it++))
  5814. {
  5815. if (item->excl_dep_on_in_subq_left_part(subq_pred))
  5816. return true;
  5817. }
  5818. return false;
  5819. }
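/*
  Illustration only, not part of the server code. A minimal sketch of the rule
  implemented by Item_equal::excl_dep_on_in_subq_left_part() above: a multiple
  equality qualifies when it has a constant plus at least one member that maps
  to a left-part field, or no constant but at least two such members.
  equality_is_pushable and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// has_const:         the equality class contains a constant.
// member_matches[i]: member i maps to a left-part field of the IN predicate.
bool equality_is_pushable(bool has_const,
                          const std::vector<bool> &member_matches)
{
  // With a constant one matching member is enough; otherwise two are needed.
  int needed= has_const ? 1 : 2;
  int found= 0;
  for (bool m : member_matches)
  {
    if (m && ++found == needed)
      return true;
  }
  return false;
}
```

  The single pass corresponds to the shared iterator in the function above,
  which resumes after the first matching member instead of restarting.
*/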
  5820. /**
  5821. @brief
  5822. Get corresponding item from the select of the right part of IN subquery
  5823. @param thd the thread handle
  5824. @param item the item from the left part of subq_pred for which
  5825. corresponding item should be found
  5826. @param subq_pred the IN subquery predicate
  5827. @details
  5828. This method looks through the fields of the select of the right part of
  5829. the IN subquery predicate subq_pred trying to find the corresponding
  5830. item 'new_item' for item. If item has equal items it looks through
  5831. the fields of the select of the right part of subq_pred for each equal
  5832. item trying to find the corresponding item.
  5833. The method assumes that the given item is either a field item or
  5834. a reference to a field item.
  5835. @retval <item*> reference to the corresponding item
  5836. @retval NULL if item was not found
  5837. */
  5838. static
  5839. Item *get_corresponding_item(THD *thd, Item *item,
  5840. Item_in_subselect *subq_pred)
  5841. {
  5842. DBUG_ASSERT(item->type() == Item::FIELD_ITEM ||
  5843. (item->type() == Item::REF_ITEM &&
  5844. ((Item_ref *) item)->ref_type() == Item_ref::VIEW_REF));
  5845. Field_pair *field_pair;
  5846. Item_equal *item_equal= item->get_item_equal();
  5847. if (item_equal)
  5848. {
  5849. Item_equal_fields_iterator it(*item_equal);
  5850. Item *equal_item;
  5851. while ((equal_item= it++))
  5852. {
  5853. field_pair=
  5854. get_corresponding_field_pair(equal_item, subq_pred->corresponding_fields);
  5855. if (field_pair)
  5856. return field_pair->corresponding_item;
  5857. }
  5858. }
  5859. else
  5860. {
  5861. field_pair=
  5862. get_corresponding_field_pair(item, subq_pred->corresponding_fields);
  5863. if (field_pair)
  5864. return field_pair->corresponding_item;
  5865. }
  5866. return NULL;
  5867. }
  5868. Item *Item_field::in_subq_field_transformer_for_where(THD *thd, uchar *arg)
  5869. {
  5870. Item_in_subselect *subq_pred= (Item_in_subselect *)arg;
  5871. Item *producing_item= get_corresponding_item(thd, this, subq_pred);
  5872. if (producing_item)
  5873. return producing_item->build_clone(thd);
  5874. return this;
  5875. }
  5876. Item *Item_direct_view_ref::in_subq_field_transformer_for_where(THD *thd,
  5877. uchar *arg)
  5878. {
  5879. if (item_equal)
  5880. {
  5881. Item_in_subselect *subq_pred= (Item_in_subselect *)arg;
  5882. Item *producing_item= get_corresponding_item(thd, this, subq_pred);
  5883. DBUG_ASSERT (producing_item != NULL);
  5884. return producing_item->build_clone(thd);
  5885. }
  5886. return this;
  5887. }
  5888. /**
  5889. @brief
  5890. Transforms item so it can be pushed into the IN subquery HAVING clause
  5891. @param thd the thread handle
  5892. @param in_item the item for which pushable item should be created
  5893. @param subq_pred the IN subquery predicate
  5894. @details
  5895. For in_item, a field from the left part of the IN subquery predicate
  5896. subq_pred, this method finds the corresponding item from the right part
  5897. of subq_pred.
  5898. If the corresponding item is found, a shell for this item is created.
  5899. This shell can be pushed into the HAVING part of subq_pred select.
  5900. @retval <item*> reference to the created corresponding item shell for in_item
  5901. @retval NULL if no corresponding item is found or the shell cannot be allocated
  5902. */
  5903. static Item*
  5904. get_corresponding_item_for_in_subq_having(THD *thd, Item *in_item,
  5905. Item_in_subselect *subq_pred)
  5906. {
  5907. Item *new_item= get_corresponding_item(thd, in_item, subq_pred);
  5908. if (new_item)
  5909. {
  5910. Item_ref *ref=
  5911. new (thd->mem_root) Item_ref(thd,
  5912. &subq_pred->unit->first_select()->context,
  5913. NullS, NullS,
  5914. &new_item->name);
  5915. if (!ref)
  5916. DBUG_ASSERT(0);
  5917. return ref;
  5918. }
  5919. return new_item;
  5920. }
  5921. Item *Item_field::in_subq_field_transformer_for_having(THD *thd, uchar *arg)
  5922. {
  5923. return get_corresponding_item_for_in_subq_having(thd, this,
  5924. (Item_in_subselect *)arg);
  5925. }
  5926. Item *Item_direct_view_ref::in_subq_field_transformer_for_having(THD *thd,
  5927. uchar *arg)
  5928. {
  5929. if (!item_equal)
  5930. return this;
  5931. else
  5932. {
  5933. Item *new_item= get_corresponding_item_for_in_subq_having(thd, this,
  5934. (Item_in_subselect *)arg);
  5935. if (!new_item)
  5936. return this;
  5937. return new_item;
  5938. }
  5939. }
  5940. /**
  5941. @brief
  5942. Find fields that are used in the GROUP BY of the select
  5943. @param thd the thread handle
  5944. @param sel the select of the IN subquery predicate
  5945. @param fields fields of the left part of the IN subquery predicate
  5946. @param grouping_list GROUP BY clause
  5947. @details
  5948. This method traverses fields which are used in the GROUP BY of
  5949. sel and saves them with their corresponding items from fields.
  5950. */
  5951. bool grouping_fields_in_the_in_subq_left_part(THD *thd,
  5952. st_select_lex *sel,
  5953. List<Field_pair> *fields,
  5954. ORDER *grouping_list)
  5955. {
  5956. DBUG_ENTER("grouping_fields_in_the_in_subq_left_part");
  5957. sel->grouping_tmp_fields.empty();
  5958. List_iterator<Field_pair> it(*fields);
  5959. Field_pair *item;
  5960. while ((item= it++))
  5961. {
  5962. for (ORDER *ord= grouping_list; ord; ord= ord->next)
  5963. {
  5964. if ((*ord->item)->eq(item->corresponding_item, 0))
  5965. {
  5966. if (sel->grouping_tmp_fields.push_back(item, thd->mem_root))
  5967. DBUG_RETURN(TRUE);
  5968. }
  5969. }
  5970. }
  5971. DBUG_RETURN(FALSE);
  5972. }
  5973. /**
  5974. @brief
  5975. Extract condition that can be pushed into select of this IN subquery
  5976. @param thd the thread handle
  5977. @param cond current condition
  5978. @details
  5979. This function builds the most restrictive condition depending only on
  5980. the list of fields of the left part of this IN subquery predicate
  5981. (directly or indirectly through equality) that can be extracted from the
  5982. given condition cond and pushes it into this IN subquery.
  5983. Example of the transformation:
  5984. SELECT * FROM t1
  5985. WHERE a>3 AND b>10 AND
  5986. (a,b) IN (SELECT x,MAX(y) FROM t2 GROUP BY x);
  5987. =>
  5988. SELECT * FROM t1
  5989. WHERE a>3 AND b>10 AND
  5990. (a,b) IN (SELECT x,MAX(y)
  5991. FROM t2
  5992. WHERE x>3
  5993. GROUP BY x
  5994. HAVING MAX(y)>10);
  5995. In details:
  5996. 1. Check what pushable formula can be extracted from cond
  5997. 2. Build a clone PC of the formula that can be extracted
  5998. (the clone is built only if the extracted formula is an AND subformula
  5999. of cond or conjunction of such subformulas)
  6000. 3. If there is no HAVING clause prepare PC to be conjuncted with
  6001. WHERE clause of this subquery. Otherwise do 4-7.
  6002. 4. Check what formula PC_where can be extracted from PC to be pushed
  6003. into the WHERE clause of the subquery
  6004. 5. Build PC_where and if PC_where is a conjunct(s) of PC remove it from PC
  6005. getting PC_having
  6006. 6. Prepare PC_where to be conjuncted with the WHERE clause of
  6007. the IN subquery
  6008. 7. Prepare PC_having to be conjuncted with the HAVING clause of
  6009. the IN subquery
  6010. @note
  6011. This method is similar to pushdown_cond_for_derived()
  6012. @retval TRUE if an error occurs
  6013. @retval FALSE otherwise
  6014. */
  6015. bool Item_in_subselect::pushdown_cond_for_in_subquery(THD *thd, Item *cond)
  6016. {
  6017. DBUG_ENTER("Item_in_subselect::pushdown_cond_for_in_subquery");
  6018. Item *remaining_cond= NULL;
  6019. if (!cond)
  6020. DBUG_RETURN(FALSE);
  6021. st_select_lex *sel = unit->first_select();
  6022. if (is_jtbm_const_tab)
  6023. DBUG_RETURN(FALSE);
  6024. if (!sel->cond_pushdown_is_allowed())
  6025. DBUG_RETURN(FALSE);
  6026. /*
  6027. Create a list of Field_pair items for this IN subquery.
  6028. It consists of the pairs of fields from the left part of this IN subquery
  6029. predicate 'left_part' and the respective fields from the select of the
  6030. right part of the IN subquery 'sel' (the field from left_part with the
  6031. corresponding field from the sel projection list).
  6032. Attach this list to the IN subquery.
  6033. */
  6034. corresponding_fields.empty();
  6035. List_iterator_fast<Item> it(sel->join->fields_list);
  6036. Item *item;
  6037. for (uint i= 0; i < left_expr->cols(); i++)
  6038. {
  6039. item= it++;
  6040. Item *elem= left_expr->element_index(i);
  6041. if (elem->real_item()->type() != Item::FIELD_ITEM)
  6042. continue;
  6043. if (corresponding_fields.push_back(
  6044. new Field_pair(((Item_field *)(elem->real_item()))->field,
  6045. item)))
  6046. DBUG_RETURN(TRUE);
  6047. }
  6048. /* 1. Check what pushable formula can be extracted from cond */
  6049. Item *extracted_cond;
  6050. cond->check_pushable_cond(&Item::pushable_cond_checker_for_subquery,
  6051. (uchar *)this);
  6052. /* 2. Build a clone PC of the formula that can be extracted */
  6053. extracted_cond=
  6054. cond->build_pushable_cond(thd,
  6055. &Item::pushable_equality_checker_for_subquery,
  6056. (uchar *)this);
  6057. /* Nothing to push */
  6058. if (!extracted_cond)
  6059. {
  6060. DBUG_RETURN(FALSE);
  6061. }
  6062. /* Collect fields that are used in the GROUP BY of sel */
  6063. st_select_lex *save_curr_select= thd->lex->current_select;
  6064. if (sel->have_window_funcs())
  6065. {
  6066. if (sel->group_list.first || sel->join->implicit_grouping)
  6067. goto exit;
  6068. ORDER *common_partition_fields=
  6069. sel->find_common_window_func_partition_fields(thd);
  6070. if (!common_partition_fields)
  6071. goto exit;
  6072. if (grouping_fields_in_the_in_subq_left_part(thd, sel, &corresponding_fields,
  6073. common_partition_fields))
  6074. DBUG_RETURN(TRUE);
  6075. }
  6076. else if (grouping_fields_in_the_in_subq_left_part(thd, sel,
  6077. &corresponding_fields,
  6078. sel->group_list.first))
  6079. DBUG_RETURN(TRUE);
  6080. /* Do 4-6 */
  6081. sel->pushdown_cond_into_where_clause(thd, extracted_cond,
  6082. &remaining_cond,
  6083. &Item::in_subq_field_transformer_for_where,
  6084. (uchar *) this);
  6085. if (!remaining_cond)
  6086. goto exit;
  6087. /*
  6088. 7. Prepare PC_having to be conjuncted with the HAVING clause of
  6089. the IN subquery
  6090. */
  6091. remaining_cond=
  6092. remaining_cond->transform(thd,
  6093. &Item::in_subq_field_transformer_for_having,
  6094. (uchar *)this);
  6095. if (!remaining_cond ||
  6096. remaining_cond->walk(&Item::cleanup_excluding_const_fields_processor,
  6097. 0, 0))
  6098. goto exit;
  6099. mark_or_conds_to_avoid_pushdown(remaining_cond);
  6100. sel->cond_pushed_into_having= remaining_cond;
  6101. exit:
  6102. thd->lex->current_select= save_curr_select;
  6103. DBUG_RETURN(FALSE);
  6104. }
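/*
  Illustration only, not part of the server code. A simplified sketch of how
  the conjuncts of the pushed condition are distributed in the example from
  the header comment of pushdown_cond_for_in_subquery(): a conjunct over a
  left-part column is rewritten to the corresponding projection item and goes
  to WHERE when that item is a GROUP BY column, to HAVING otherwise. It
  assumes the subquery is grouped and that every conjunct references exactly
  one column; Conjunct, Pushed and split_pushdown are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// One AND-part of the outer condition, e.g. {"a", ">3"}.
struct Conjunct
{
  std::string column;
  std::string rest;
};

struct Pushed
{
  std::vector<std::string> where;
  std::vector<std::string> having;
};

Pushed split_pushdown(const std::vector<Conjunct> &conds,
                      const std::map<std::string, std::string> &left_to_right,
                      const std::set<std::string> &group_by)
{
  Pushed out;
  for (const Conjunct &c : conds)
  {
    auto it= left_to_right.find(c.column);
    if (it == left_to_right.end())
      continue;                       // not a left-part column: not pushable
    const std::string &inner= it->second;
    if (group_by.count(inner))
      out.where.push_back(inner + c.rest);   // grouping column: WHERE
    else
      out.having.push_back(inner + c.rest);  // e.g. an aggregate: HAVING
  }
  return out;
}
```

  With the mapping a -> x, b -> MAX(y) and GROUP BY x, the conjuncts a>3 and
  b>10 become x>3 in WHERE and MAX(y)>10 in HAVING, as in the example.
*/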