Skip to content

MDEV-36483 Optimizer Trace Replay step #1: dump DDLs of tables/views #4034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

bsrikanth-mariadb
Copy link

@bsrikanth-mariadb bsrikanth-mariadb force-pushed the 12.1-MDEV-36483-dump-ddls-of-tables-views branch from eec9242 to 969a713 Compare May 12, 2025 03:30
@spetrunia spetrunia self-requested a review May 14, 2025 13:18
This feature stores the ddls of the tables/views that are used in
a query, to the optimizer trace. It is currently controlled by a
system variable store_ddls_in_optimizer_trace, and is not enabled
by default. All the ddls will be stored in a single json array, with each
element having table/view name, and the associated create definition
of the table/view.

The approach taken is to read global query_tables from the thd->lex,
and read them in reverse. Create a record with table_name, ddl of
the table and add simulatenously to the hash, and records list.
dbName_plus_tableName is used as a key, and the duplicate entries are
not added to the hash. Once all the tables from global query_tables
are read and added to the hash, and records list, the records list
is iterated to fetch one record at a time and its content is finally
put into the trace's list_ddls array object.
@bsrikanth-mariadb bsrikanth-mariadb force-pushed the 12.1-MDEV-36483-dump-ddls-of-tables-views branch 4 times, most recently from c31b8fd to c1c1a62 Compare May 23, 2025 10:58
Copy link
Member

@spetrunia spetrunia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've already requested: Please apply the output of

git diff -U0 --no-color --relative 4b79d7b8ee557d53a859aedec839b8673585b514 | clang-format-diff -p1 -i

Here it is:

diff --git a/sql/opt_trace_ddl_info.cc b/sql/opt_trace_ddl_info.cc
index c9ce9645781..d6b729efa60 100644
--- a/sql/opt_trace_ddl_info.cc
+++ b/sql/opt_trace_ddl_info.cc
@@ -12,7 +12,8 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1335  USA */
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1335
+   USA */
 
 #include "opt_trace_ddl_info.h"
 #include "sql_show.h"
@@ -26,7 +27,7 @@
   @file
 
   @brief
-    Stores the ddls of the tables, and views that are used 
+    Stores the ddls of the tables, and views that are used
     in either SELECT, INSERT, DELETE, and UPDATE queries,
     into the optimizer trace. All the ddls are stored together
     at one place as a JSON array object with name "list_ddls"
@@ -42,7 +43,8 @@ struct DDL_Info_Record
   helper function to know the key portion of the record
   that is stored in hash, and record list.
 */
-static const uchar *get_rec_key(const void *entry_, size_t *length, my_bool flags)
+static const uchar *get_rec_key(const void *entry_, size_t *length,
+                                my_bool flags)
 {
   auto entry= static_cast<const DDL_Info_Record *>(entry_);
   *length= strlen(entry->name);
@@ -52,7 +54,8 @@ static const uchar *get_rec_key(const void *entry_, size_t *length, my_bool flag
 /*
   helper function to free record from the hash
 */
-static void free_rec(void *entry_) {
+static void free_rec(void *entry_)
+{
   auto entry= static_cast<DDL_Info_Record *>(entry_);
   my_free(entry->name);
   my_free(entry->stmt);
@@ -110,15 +113,15 @@ static void create_view_def(THD *thd, TABLE_LIST *table, String *name,
 }
 
 /*
-  Stores the ddls of the tables, and views that are used 
+  Stores the ddls of the tables, and views that are used
   in either SELECT, INSERT, DELETE, and UPDATE queries,
   into the optimizer trace.
   Global query_tables are read in reverse order from the thd->lex,
   and a record with table_name, and ddl of the table are created.
   Hash is used to store the records, where in no duplicates
   are stored. dbName_plus_tableName is used as a key to discard any
-  duplicates. If a new record that is created is not in the hash, 
-  then that is dumped into the trace.  
+  duplicates. If a new record that is created is not in the hash,
+  then that is dumped into the trace.
 */
 void store_table_definitions_in_trace(THD *thd)
 {
diff --git a/sql/opt_trace_ddl_info.h b/sql/opt_trace_ddl_info.h
index 7e130ce6a05..8f4e8ef49c0 100644
--- a/sql/opt_trace_ddl_info.h
+++ b/sql/opt_trace_ddl_info.h
@@ -12,7 +12,8 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1335  USA */
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1335
+   USA */
 
 #ifndef OPT_TRACE_DDL_INFO
 #define OPT_TRACE_DDL_INFO
diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
index 609e4ac242d..9644a912b2e 100644
--- a/sql/sql_parse.cc
+++ b/sql/sql_parse.cc
@@ -92,7 +92,7 @@
 #include "opt_trace.h"
 #include "mysql/psi/mysql_sp.h"
 
-#include "my_json_writer.h" 
+#include "my_json_writer.h"
 #include "opt_trace_ddl_info.h"
 #define FLAGSTR(V,F) ((V)&(F)?#F" ":"")
 

sql/sql_show.h Outdated
@@ -82,7 +82,6 @@ int get_all_tables(THD *thd, TABLE_LIST *tables, COND *cond);
int show_create_table(THD *thd, TABLE_LIST *table_list, String *packet,
Table_specification_st *create_info_arg,
enum_with_db_name with_db_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this change.

sql/sql_class.cc Outdated
@@ -1728,7 +1728,6 @@ void THD::cleanup(void)
statement_rcontext_reinit();
auto_inc_intervals_forced.empty();
auto_inc_intervals_in_cur_stmt_for_binlog.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this change also please.

Copy link
Member

@spetrunia spetrunia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More smaller comments.

tbl->table->s->tmp_table == TRANSACTIONAL_TMP_TABLE)))
{
size_t name_len=
strlen(tbl->table_name.str) + strlen(tbl->get_db_name().str) + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use tbl->table_name.length, not strlen. Same with get_db_name.length.

else
show_create_table(thd, tbl, &ddl, NULL, WITH_DB_NAME);

buf= ddl.c_ptr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ddl.length(), no need for strlen.


void store_table_definitions_in_trace(THD *thd);

#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add new line at end of file.

}
my_hash_free(&hash);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline at the end of file.

return true;
else if (get_table_category(tbl->get_db_name(), tbl->get_table_name()) !=
TABLE_CATEGORY_USER ||
strcmp(tbl->get_db_name().str, "sys") == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why strcmp(tbl->get_db_name().str, "sys") ? Please explain.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any "sys" table should return true. I felt, get_table_category() has fewer restrictions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the "sys" database, I see only VIEWs (over information_schema and performance_schema tables) and one base table - sys.sys_config .
I don't think factoring out "sys" gives any value here, please remove it.

P_S and I_S tables are filtered out by the get_table_category()...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the "sys" database, I see only VIEWs (over information_schema and performance_schema tables) and one base table - sys.sys_config . I don't think factoring out "sys" gives any value here, please remove it.

P_S and I_S tables are filtered out by the get_table_category()...

Yes, I saw some of the tests having sys.sys_config actually failed, when invoking show_create_table().

The record has full table_name,
and ddl of the table/view.
*/
static void dump_table_definition(THD *thd, DDL_Info_Record *rec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently this is also used for dumping VIEWs. Please rename.

tbl->table->s->tmp_table == TRANSACTIONAL_TMP_TABLE)))
{
size_t name_len=
tbl->get_table_name().length + tbl->get_db_name().length + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this allocates memory for db_name, the '.' and the table_name. Where is the space for \0 at the end?
Why not use String object here?

Copy link
Author

@bsrikanth-mariadb bsrikanth-mariadb May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this allocates memory for db_name, the '.' and the table_name. Where is the space for \0 at the end? Why not use String object here?

Yes, I could have used the String object. But, this requires allocating new memory anyways, so as to copy the string contents, as the string object is freed after each iteration.

@spetrunia spetrunia self-requested a review May 26, 2025 12:07
Copy link
Member

@spetrunia spetrunia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see any tests for querying tables located in different databases. If they are not there, please add them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants