MySQL GTID MySQL GTID全面总结
https://mp.weixin.qq.com/s/kiU 人气:001 GTID简介
GTID,全称Global transaction identifiers,也称之为全局事务ID。MySQL-5.6.2开始支持,MySQL-5.6.10后完善,GTID 分成两部分,一部分是服务的UUid,UUID保存在mysql数据目录的auto.cnf文件中,
这是一个非常重要的文件,不能删除,这一部分是不会变的。下面是一个uuid的值举例:
[root@dev01 mysql]# cat auto.cnf [auto] server-uuid=ac1ebad0-ef76-11e7-872b-080027a03bb6
另外一部分就是事务ID了,随着事务的增加,值依次递增。也就是说,GTID实际上是由UUID+TID组成的。其中UUID是一个MySQL实例的唯一标识。TID代表了该实例上已经提交的事务数量。如下所示为一个GTID的例子:
3db33b36-0e51-409f-a61d-c99756e90155:1-14
02 GTID工作原理
1、master更新数据时,会在事务前产生GTID,一同记录到binlog日志中。
2、slave端的i/o 线程将变更的binlog,写入到本地的relay log中。
3、sql线程从relay log中获取GTID,然后对比slave端的binlog是否有记录。
4、如果有记录,说明该GTID的事务已经执行,slave会忽略。
5、如果没有记录,slave就会从relay log中执行该GTID的事务,并记录到binlog。
6、在解析过程中会判断是否有主键,如果没有就用二级索引,如果没有就用全部扫描。
03 GTID的优缺点
优点:
1.一个事务对应一个唯一GTID,一个GTID在一个服务器上只会执行一次
2.GTID是用来代替传统复制的方法,GTID复制与普通复制模式的最大不同就是不需要指定二进制文件名和位置
3.减少手工干预和降低服务故障时间,当主机挂了之后通过软件从众多的备机中提升一台备机为主机
缺点:
1.不支持非事务引擎
2.不支持create table ... select 语句复制(主库直接报错)
原理:( 会生成两个sql,一个是DDL创建表SQL,一个是insert into 插入数据的sql。
由于DDL会导致自动提交,所以这个sql至少需要两个GTID,但是GTID模式下,只能给这个sql生成一个GTID )
3.不允许一个SQL同时更新一个事务引擎表和非事务引擎表
4.开启GTID需要重启(5.7除外)
5.对于create temporary table 和 drop temporary table语句不支持
6.不支持sql_slave_skip_counter
04 测试环境搭建
节点:
server1 192.168.197.128 3306 Master
server2 192.168.197.137 3306 Slave
server3 192.168.197.136 3306 Slave
开启GTID需要启用这三个参数:
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates = 1
搭建测试环境的步骤如下:
1.在主节点上创建复制用户,开启主节点的GTID选项;
mysql> grant replication slave on *.* to 'repluser'@'%' identified by '123456'; Query OK, 0 rows affected, 1 warning (0.00 sec)
2.从节点上进行change master to操作,搭建主从,如下:
mysql> change master to -> master_host='192.168.197.128', -> master_user='repluser', -> master_password='123456', -> master_port=3306, -> master_auto_position=1; Query OK, 0 rows affected, 2 warnings (0.01 sec)
3.搭建成功后,在主节点197.128上查看从节点是否加入:
mysql> show slave hosts; +-----------+------+------+-----------+--------------------------------------+ | Server_id | Host | Port | Master_id | Slave_UUID | +-----------+------+------+-----------+--------------------------------------+ | 3 | | 3306 | | 969488f5-c486-11e8-adb7-000c29bf2c97 | | 2 | | 3306 | | bb874065-c485-11e8-8b52-000c2934472e | +-----------+------+------+-----------+--------------------------------------+ rows in set (. sec)
查看连接:
mysql> show processlist; +----+----------+------------------+------+------------------+------+---------------------------------------------------------------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+----------+------------------+------+------------------+------+---------------------------------------------------------------+------------------+ | | root | localhost | NULL | Query | 0 | starting | show processlist | | 3 | repluser | work_NAT_4:60051 | NULL | Binlog Dump GTID | | Master has sent all binlog to slave; waiting for more updates | NULL | | | repluser | work_NAT_5: | NULL | Binlog Dump GTID | 5970 | Master has sent all binlog to slave; waiting for more updates | NULL | +----+----------+------------------+------+------------------+------+---------------------------------------------------------------+------------------+ rows in set (. sec)
4.三台测试环境的UUID分别是:
197.128 mysql> select @@server_uuid; +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | bd0d-8691-11e8-afd6-4c3e51db5828 | +--------------------------------------+ row in set (0.00 sec) 197.137 mysql> select @@server_uuid; +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | bb874065-c485-11e8-8b52-000c2934472e | +--------------------------------------+ row in set (0.00 sec) 197.136 mysql> select @@server_uuid; +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | f5-c486-11e8-adb7-000c29bf2c97 | +--------------------------------------+ row in set (0.00 sec)
05 开始测试
测试环境主要分为以下几个方面:
a.测试复制的故障转移
b.复制错误跳过
1 测试复制的故障转移
先来看看测试复制的故障转移:
(1)首先将server 3的复制过程停掉
mysql> stop slave; Query OK, 0 rows affected (0.01 sec)
(2)在server 1上创建一些数据
mysql> create table yyy.a(id int); Query OK, 0 rows affected (0.03 sec) mysql> create table yyy.b(id int); Query OK, 0 rows affected (0.02 sec) mysql> create table yyy.c(id int); Query OK, 0 rows affected (0.02 sec)
(3)在另外两台上面查看数据结果:
server mysql> show tables from yyy; +---------------+ | Tables_in_yyy | +---------------+ | a | | b | | c | +---------------+ rows in set (0.00 sec) server mysql> show tables from yyy; Empty set (0.00 sec)
(4)此时可以发现,server 2 的数据相比较server 3,它的数据比较新,此时停止server 1,模拟主服务器宕机:
[root@work_NAT_1 init.d]# service mysqld stop Shutting down MySQL............ [ OK ]
(5)此时我们发现其他两个节点已经不能访问server 1了
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Reconnecting after a failed master event read Master_Host: 192.168.197.128 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000006 Read_Master_Log_Pos: 1364 Relay_Log_File: mysql-relay-bin.000004 Relay_Log_Pos: 1569 Relay_Master_Log_File: mysql-bin.000006 Slave_IO_Running: Connecting Slave_SQL_Running: Yes Exec_Master_Log_Pos: 1364 Relay_Log_Space: 2337 Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 2003 Last_IO_Error: error reconnecting to master 'repluser@192.168.197.128:3306' - retry-time: 60 retries: 1 Last_SQL_Errno: 0
(6)我们需要设置server 2为server 3的主库,因为server 2的数据比较新。此时如果采用以前的办法,需要计算之前主库的log_pos和当前要设置成主库的log_pos,很有可能出错。所以出现了一些高可用性的工具如MHA,MMM等解决问题。
在MySQL5.6之后,很简单的解决了这个难题。因为同一事务的GTID在所有节点上的值一致,那么根据server3当前停止点的GTID就能定位到server2上的GTID,所以直接在server3上执行change即可:
mysql> change master to -> master_host='192.168.197.137', -> master_user='repluser', -> master_password='123456', -> master_port=, -> master_auto_position=; Query OK, rows affected, warnings (0.01 sec)
(7)此时查看server 3上的数据,可以发现,数据已经同步过来了;
2 复制错误跳过
上面的测试中,最终的结果是server 2是主节点,server 3是从节点,下面我们来验证复制错误跳过的办法。
(1)首先我们在从节点上执行一个drop的语句,让两边的数据不一致,如下:
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | DBAs | | customer | | inc_db | | mysql | | performance_schema | | sys | | testdb | | yeyz | | yyy | +--------------------+ rows in set (. sec) mysql> drop database yyy; Query OK, rows affected (. sec) mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | DBAs | | customer | | inc_db | | mysql | | performance_schema | | sys | | testdb | | yeyz | +--------------------+ rows in set (. sec)
(2)然后我们在server 2上执行drop database yyy的操作,如下:
mysql> drop database yyy; Query OK, 3 rows affected (0.02 sec)
(3)此时我们看到server 3上已经出现了主从不同步的错误警告,因为它上面并没有yyy的数据库(前一步已经删除),错误情况如下;
mysql> show slave status\G *************************** . row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.197.137 Master_User: repluser Master_Port: Connect_Retry: Master_Log_File: mysql-bin. Read_Master_Log_Pos: Relay_Log_File: mysql-relay-bin. Relay_Log_Pos: Relay_Master_Log_File: mysql-bin. Slave_IO_Running: Yes Slave_SQL_Running: No Last_Errno: Last_Error: Error 'Can't drop database 'yyy'; database doesn't exist' on query. Default database: 'yyy'. Query: 'drop database yyy' Skip_Counter: Exec_Master_Log_Pos: Relay_Log_Space: Last_SQL_Error: Error 'Can't drop database 'yyy'; database doesn't exist' on query. Default database: 'yyy'. Query: 'drop database yyy' Replicate_Ignore_Server_Ids: Master_Server_Id: Master_UUID: bb874065-c485-e8-b52-c2934472e Master_Info_File: mysql.slave_master_info Retrieved_Gtid_Set: bd0d--e8-afd6-c3e51db5828:-, bb874065-c485-e8-b52-c2934472e: Executed_Gtid_Set: db33b36-e51-f-a61d-c99756e90155:-, bd0d--e8-afd6-c3e51db5828:-, f5-c486-e8-adb7-c29bf2c97: Auto_Position: Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: row in set (0.00 sec)
(4)当我们使用传统的方法来跳过这个错误的时候,会提示出GTID模式下不被允许,如下:
mysql> set global sql_slave_skip_counter=; ERROR (HY000): sql_slave_skip_counter can not be set when the server is running with @@GLOBAL.GTID_MODE = ON. Instead, for each transaction that you want to skip, generate an empty transaction with the same GTID as the transaction
那么这种方式下应该如何跳过这个错误呢?
(5)因为我们是通过GTID来进行复制的,也需要跳过这个事务从而继续复制,这个事务可以到主上的binlog里面查看:因为不知道找哪个GTID上出错,所以也不知道如何跳过哪个GTID。但是我们可以在show slave status里的信息里找到在执行Master里的POS:2012,也就是上述第(3)步第18行代码。现在我们拿着这个pos:2012去server 2的日志里面找,可以发现如下信息:
# at 2012 #190305 20:59:07 server id 2 end_log_pos 2073 GTID last_committed=9 sequence_number=10 rbr_only=no SET @@SESSION.GTID_NEXT= 'bb874065-c485-11e8-8b52-000c2934472e:1'/*!*/; # at 2073 #190305 20:59:07 server id 2 end_log_pos 2158 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=/*!*/; drop database yyy /*!*/;
(6)我们可以看到GTID_NEXT的值是
,然后我们通过下面的方法来重新恢复主从复制:
mysql> stop slave; Query OK, rows affected (0.00 sec) mysql> set session gtid_next='bb874065-c485-11e8-8b52-000c2934472e:1'; Query OK, rows affected (0.00 sec) mysql> begin; Query OK, rows affected (0.00 sec) mysql> commit; Query OK, rows affected (0.01 sec) mysql> set session gtid_next=automatic; Query OK, rows affected (0.00 sec) mysql> start slave; Query OK, rows affected (0.00 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.197.137 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000002 Read_Master_Log_Pos: 2158 Relay_Log_File: mysql-relay-bin.000003 Relay_Log_Pos: 478 Relay_Master_Log_File: mysql-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 2158 Relay_Log_Space: 1527 Until_Condition: None Master_Server_Id: 2 Master_UUID: bb874065-c485-11e8-8b52-000c2934472e Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: Retrieved_Gtid_Set: bd0d-8691-11e8-afd6-4c3e51db5828:-7, bb874065-c485-11e8-8b52-000c2934472e: Executed_Gtid_Set: db33b36-0e51-409f-a61d-c99756e90155:-14, bd0d-8691-11e8-afd6-4c3e51db5828:-7, f5-c486-11e8-adb7-000c29bf2c97:, bb874065-c485-11e8-8b52-000c2934472e: Auto_Position: Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: row in set (0.00 sec)
加载全部内容