Wednesday, August 22, 2012

MySQL 5.6 replication gotchas (and bugs)

There has been a lot of talk about MySQL 5.6 replication improvements. With few exceptions, what I have seen was either marketing messages or hearsay. This means that few people have really tried out the new features to see whether they meet the users needs.

As usual, I did try the new version in my environment. I like to form my own opinion based on experiments, and so I have been trying out these features since they have appeared in early milestones.

What follows is a list of (potentially) surprising results that you may get when using MySQL 5.6.
All the examples are made using MySQL 5.6.6.

Gotcha #1 : too much noise

I have already mentioned that MySQL 5.6 is too verbose when creating data directory. This also means that your error log may have way more information than you'd like to get. You should check the contents of the error log when you start, and either clean it up before using it on a regular basis or take note of what's there after a successful installation, so you won't be surprised when something goes wrong.

Gotcha #2 : Innodb tables where you don't expect them

Until version 5.5, after you installed MySQL, you could safely drop the ib* files, change the configuration file, and restart MySQL with optimized parameters. Not anymore.

When you run mysqld with the --bootstrap option (which is what mysql_install_db does), the server creates 5 innodb tables:

 select table_schema, table_name 
from information_schema .tables 
where engine='innodb';
+--------------+----------------------+
| table_schema | table_name           |
+--------------+----------------------+
| mysql        | innodb_index_stats   |
| mysql        | innodb_table_stats   |
| mysql        | slave_master_info    |
| mysql        | slave_relay_log_info |
| mysql        | slave_worker_info    |
+--------------+----------------------+

The slave_* tables are needed for the safe crash slave feature, which we'll cover later. The innodb_*_stats tables are as documented at Innodb persistent stats, and they seem to contain almost the same info of the tables with the same name that you find in Percona Server INFORMATION_SCHEMA. I can only speculate why these tables are in mysql rather than in performance_schema.

Another side effect of this issue is that, whatever setting you want to apply to innodb (size of the data files, file-per-table, default file format, and so on) must be done when you run mysqld --bootstrap.

Gotcha #3 : Global transaction IDs and security

The information about Global transaction ID is not easy to locate. But eventually, searching the manual, you will get it. The important information that you take from this page is that this feature only works if you enable all these options in all the servers used for replication:

log-bin
server-id=xx
log-slave-updates 
gtid-mode=ON
disable-gtid-unsafe-statements

The first two options are not a surprise. You need them for replication anyway. Check.

The third one is puzzling. Why would you want this option in a master? But then you realize that this will allow any server to be promoted or demoted at will. Check.

gtid-mode is the main option that needs to be enabled for global transaction IDs. Check

The last option forces the server to be safe, by using only transactional tables, and by forbidding things like temporary tables inside transactions and create table ... select. Which means that if you try to update a MyISAM table in the master, the statement will fail. You won't be allowed to do it. Check?

The trouble is, if you enable gtid-mode=ON (with its mandatory ancillary options), you can't run mysql_secure_installation, because that utility needs to delete anonymous users and clean the 'db' table for anonymous usage of the 'test' database.

The workaround is to enable GTID after you secure the installation, which means one more server restart.

Gotcha #4 (bug): multi thread slave won't work without safe-crash slave tables

To enable parallel replication, you need to change the value of 'slave_parallel_workers" to a value between 1 and 1024.

show variables like '%worker%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| slave_parallel_workers | 0     |
+------------------------+-------+
1 row in set (0.00 sec)

slave1 [localhost] {msandbox} ((none)) > stop slave;
Query OK, 0 rows affected (0.06 sec)

slave1 [localhost] {msandbox} ((none)) > set global slave_parallel_workers=5;
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {msandbox} (mysql) > start slave;
Query OK, 0 rows affected, 1 warning (0.05 sec)

slave1 [localhost] {msandbox} ((none)) > select * from mysql.slave_worker_info\G
Empty set (0.00 sec)

What the hell? The workers table is empty.

I know the cause: the slave_worker_info table is not activated unless you also set relay_log_info_repository='table'. What I don't understand is WHY it is like that. If this is documented, I could not find where.

Anyway, once you are in this bizarre condition, you can't activate relay_log_info_repository='table', because of the following

Gotcha #5 (bug) : master and relay_log repository must be either set forever or they will fail

After we have activated parallel threads, without enabling table repositories, you can't easily get to a clean replication environment:
set global relay_log_info_repository='table';
start slave;
ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log
And the error log says:
120822 14:15:08 [ERROR] Error creating relay log info: Error transfering information.

What you need to do is

  • stop the slave
  • enable both master_info_repository and relay_log_info_repository as 'table'
  • set the number of parallel threads
  • restart the slave
slave1 [localhost] {msandbox} (mysql) > stop slave;
Query OK, 0 rows affected (0.02 sec)

slave1 [localhost] {msandbox} (mysql) > set global master_info_repository='table';
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {msandbox} (mysql) > set global relay_log_info_repository='table';
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {msandbox} (mysql) > set global slave_parallel_workers=5;
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {msandbox} (mysql) > start slave;
Query OK, 0 rows affected, 1 warning (0.01 sec)

slave1 [localhost] {msandbox} (mysql) > select count(*) from slave_worker_info;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

This sequence of commands will start parallel replication, although MySQL crashes when restarting the slave.

Gotcha #6 : Global transaction IDs not used in parallel threads

Global transaction IDs (GTIDs) are very useful when you need to switch roles from master to slave, and especially when you deal with unplanned failovers. They are also a great simplification in many cases where you need to identify a transaction without getting lost in the details of binary log file and position.

However, one of the cases where GTIDs would have been most useful, they are not there. The table mysql.slave_workers_info still identifies transactions by binary log and position. Similarly, CHANGE MASTER TO does not use GTIDs, other than allowing the automatic alignment (MASTER_AUTO_POSITION=1). If you need to perform any fine tuning operations, you need to revert to the old binary log + position.

15 comments:

Lars Thalmann said...

Thanks Giuseppe!

I'm very happy to see the great feedback you are providing for the new MySQL 5.6 features!

This kind of detailed comments are very valuable to us. We are working through all your comments and they will help us make MySQL even better.

Thanks,
Lars Thalmann
MySQL Development Director, Oracle

Justin Swanhart said...

I think index_stats and table_stats are in the `mysql` dir because these represent the persistent InnoDB stats that can be modified by the end user.

Just a guess though!

hingo said...

Regarding Gotcha #3: Is it possible to do CREATE USER and similar admin statements, which internally modify the mysql.* tables?

Giuseppe Maxia said...

@hingo,
Yes. CREATE USER, GRANT, and REVOKE work as expected. Same goes for CREATE EVENT|PROCEDURE|FUNCTION.
But direct manipulation of mysql.* MyISAM tables fails.

Cédric said...

Thx for this review Giuseppe.
"too much noise" can't be more than noise produced by XtraDB Cluster (Galera) ;-)

Sheeri K. Cabral said...

I'm confused about gotcha #2. If you change things like size of the data files, innodb_file_per_table, default file format, you have to restart the server anyway. So what's different now?

Giuseppe Maxia said...

@Sheeri,
The differences are:
1) unlike MySQL 5.5, you can't install MySQL and then change the innodb data file size. You could do that, because --bootstrap did not create any innodb tables. Now you get a conflict because ibdata* files have been already created. Previously, you could delete ibdata* and create it again with the new size. Now you can't.
2) If you want to adjust the innodb tables (change file size, change innodb-file-per-table, get rid of unused space), the method that worked until 5.5 was: (a) dump all databases; (b) drop all databases except mysql; (c) delete all ib* files; (d) restore.
Now you can't do that. You will need to drop the innodb tables inside mysql, drop the databases, and then do your backup and restore. It is not a big deal, once you know it, but if you have an established routine, this will spoil it.

rubenc said...

If you don't like certain features well.. just stick on 5.5, it's simple.

Also, you call "bugs" to non-bugs thingies, and keep in mind that you're talking about a DEVELOPMENT (you know what that means? Means you can find bugs, so where's the big deal?) version, so...

Regards.

Giuseppe Maxia said...

@rubenc,
what's your point? that I should shut up and do something else?
I believe that sharing my findings could be useful for both the developers and the users.

About bugs, I have a broad definition about what is a bug. Feel free to disagree on both counts.

Giuseppe Maxia said...

@rubenc,
what's your point? that I should shut up and do something else?
I believe that sharing my findings could be useful for both the developers and the users.

About bugs, I have a broad definition about what is a bug. Feel free to disagree on both counts.

Dr. Tyrell said...

FYI
disable-gtid-unsafe-statements
Obsolete: Replaced by --enforce-gtid-consistency in MySQL 5.6.9. (Bug #14775984)

Giuseppe Maxia said...

@DrTyrell,
Thanks for the info. It was, however, already noted in a later article in this same blog.

Anonymous said...

Regarding Gotcha #6 : Global transaction IDs not used in parallel threads

1. To use multi thread replication, is it must to enable GTID? or can we use multi thread even when GTID is turned off in mysql 5.6 ?
2. If GTID is a must to use multi thread, you are saying parallel replication still uses binlog position and not transaction ID's ?

Thanks

Giuseppe Maxia said...

@anonymous,
you don't need GTID to use parallel replication. The point is that, if you use GTID, there is no facility to monitor parallel replication with GTIDs.

Anonymous said...

Question Related to Multi Thread Replication.

* What if the last transaction in binlog_1 is a huge blog insert for db1.tb1 and if first transaction in binlog2 is for db2.tb2.
* Does binlogs get transmitted in First-In-First-Out order which will cause replication to stall for thread1 to finish even when thread2 is ready for next event
OR
Does it use more than one binlog.

Thanks