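Help needed: problem with master-master replication (details below)
1. Setup: the hostnames are master1 and master2.
master1 and master2 replicate from each other in a master-master configuration.
master1 is the active master and master2 is the passive master.
2. Problem description
Replication from master1 to master2 works normally. The problem is master1 replicating from master2.
(1) Slave status on master1: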
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: XXXXX
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql_bin.000001
Read_Master_Log_Pos: 177339
Relay_Log_File: relay-bin.000003
Relay_Log_Pos: 235
Relay_Master_Log_File: mysql_bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: XXXXXXXXXX
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 177339
Relay_Log_Space: 235
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)
ERROR:
No query specified
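Note: checking the slave status repeatedly always gives the same result. Neither Read_Master_Log_Pos nor Relay_Log_Space ever changes, although normally both should change constantly (this is a production environment). The relay log file name does rotate, from relay-bin.000001 to relay-bin.000003, but its size never changes and is always 235, which means nothing is actually being written to the relay log.
(2) Master status on master1: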
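The check described above (take the slave status twice and see whether the positions moved) can be sketched in Python. This is only an illustration: `parse_vertical_output` and `stalled_fields` are hypothetical helper names, and the two snapshots are abbreviated from the values quoted in this post rather than complete `show slave status\G` output.

```python
def parse_vertical_output(text):
    """Parse the 'Key: value' lines of mysql \\G output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" separator, and lines without a colon.
        if not line or line.startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def stalled_fields(before, after,
                   watched=("Read_Master_Log_Pos", "Relay_Log_Space")):
    """Return the watched fields whose value is identical in both snapshots."""
    return [name for name in watched
            if name in before and before[name] == after.get(name)]


# Abbreviated snapshots (assumption: only the fields discussed in the post).
SNAPSHOT_EARLIER = """
Slave_IO_Running: Yes
Read_Master_Log_Pos: 177339
Relay_Log_File: relay-bin.000001
Relay_Log_Space: 235
"""

SNAPSHOT_LATER = """
Slave_IO_Running: Yes
Read_Master_Log_Pos: 177339
Relay_Log_File: relay-bin.000003
Relay_Log_Space: 235
"""

earlier = parse_vertical_output(SNAPSHOT_EARLIER)
later = parse_vertical_output(SNAPSHOT_LATER)
print("fields that did not move:", stalled_fields(earlier, later))
```

On a healthy slave of a busy master, both watched fields would be expected to differ between two snapshots taken some time apart.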
mysql> show master status\G;
*************************** 1. row ***************************
File: mysql_bin.000001
Position: 57086958
Binlog_Do_DB: XXXXX
Binlog_Ignore_DB:
1 row in set (0.00 sec)
ERROR:
No query specified
Note: the Position here is advancing; it changes constantly.
(3) Processlist on master2:
mysql> show processlist;
+------+-------------+-----------+------+---------+------------+-----------------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-------------+-----------+------+---------+------------+-----------------------------------------------------------------------+------------------+
| 4384 | system user | | NULL | Connect | 4952 | Waiting for master to send event | NULL |
| 4385 | system user | | NULL | Connect | 4294967295 | Has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 6330 | root | localhost | NULL | Query | 0 | NULL | show processlist |
+------+-------------+-----------+------+---------+------------+-----------------------------------------------------------------------+------------------+
Note: the repl user that master1 uses for replication does not appear here at all.
After I run stop slave; start slave; on master1, the repl user shows up on master2:
mysql> show processlist;
+------+-------------+-------------------+------+-------------+------+-----------------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-------------+-------------------+------+-------------+------+-----------------------------------------------------------------------+------------------+
| 4384 | system user | | NULL | Connect | 4999 | Waiting for master to send event | NULL |
| 4385 | system user | | NULL | Connect | 0 | Has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 6330 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 6382 | repl | xxxxx:24541 | NULL | Binlog Dump | 3 | Writing to net | NULL |
+------+-------------+-------------------+------+-------------+------+-----------------------------------------------------------------------+------------------+
4 rows in set (0.00 sec)
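However, on master1, Read_Master_Log_Pos and Relay_Log_Space in the slave status still never change.
The repl replication thread is in the "Writing to net" state, which means it is writing binlog contents into packets to send to master1. But after about 80 seconds the repl process disappears and never comes back, unless I restart replication on master1 again. Does this mean that sending the binlog contents from master2 to master1 timed out? That master1 did not respond, or that the three-way handshake failed?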
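To track when the repl thread appears and disappears, saved `show processlist` output can be scanned for a row whose Command is Binlog Dump. The following is a minimal sketch, not a finished tool: `parse_processlist` and `binlog_dump_threads` are hypothetical helper names, and the sample tables are shortened from the output pasted above.

```python
def parse_processlist(text):
    """Parse the boxed table printed by the mysql client into row dicts.

    Limitation: a literal "|" inside the Info column would break the split.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Header and data lines are framed by "|"; border lines start with "+".
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def binlog_dump_threads(rows):
    """Return the rows of connected replication clients (Command = Binlog Dump)."""
    return [row for row in rows if row.get("Command") == "Binlog Dump"]


# Shortened copies of the two processlist outputs from master2.
BEFORE_RESTART = """
| Id | User | Host | db | Command | Time | State | Info |
| 4384 | system user | | NULL | Connect | 4952 | Waiting for master to send event | NULL |
| 6330 | root | localhost | NULL | Query | 0 | NULL | show processlist |
"""

AFTER_RESTART = """
| Id | User | Host | db | Command | Time | State | Info |
| 4384 | system user | | NULL | Connect | 4999 | Waiting for master to send event | NULL |
| 6330 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 6382 | repl | xxxxx:24541 | NULL | Binlog Dump | 3 | Writing to net | NULL |
"""

for label, text in (("before", BEFORE_RESTART), ("after", AFTER_RESTART)):
    dumps = binlog_dump_threads(parse_processlist(text))
    print(label, [(row["User"], row["State"], row["Time"]) for row in dumps])
```

Running this against output captured every few seconds would show how long the Binlog Dump row survives after each start slave, which is the roughly 80-second window described above.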
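I have already tried telnet from master1 to master2 and from master2 to master1, and the connection can be established in both directions.
Any advice from the experts would be much appreciated!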
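Update: I finally found a clue myself. I am now fairly sure that when the thread stays stuck in the "Writing to net" state like this, the cause is a network problem.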