主从复制1062错误解决方法
当复制中断的时候,我们常用的方法是跳过错误,比如SET GLOBAL SQL_SLAVE_SKIP_COUNTER =1或者直接slave-skip-errors=1062,这样确实解决了问题,恢复了复制。但是久而久之主从数据相差就很大了。对于复制正常以后,我们还需要使用pt-table-checksum数据效验,以及pt-table-sync同步数据,今天线上一个从库中断,查看了错误,发现是主键冲突,至于为何会发现这些情况,请阅读MySQL Replication需要注意的问题
Last_Error: Error \'Duplicate entry \'192442\' for key \'PRIMARY\'\' on query. Default database: \'xxxxxxxxxxx\'. Query: \'INSERT INTO xxxxxxxxxxxxx(playerId, `type`, `count`) VALUES( NAME_CONST(\'pPlayerId\',629014986), NAME_CONST(\'pType\',8), 0)\' Skip_Counter: 0 Exec_Master_Log_Pos: 1499475 Relay_Log_Space: 4829077512 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1062 Last_SQL_Error: Error \'Duplicate entry \'192442\' for key \'PRIMARY\'\' on query. Default database: \'xxxxxxxxxxxxxx\'. Query: \'INSERT INTO xxxxxxxxxxx(playerId, `type`, `count`) VALUES( NAME_CONST(\'pPlayerId\',629014986), NAME_CONST(\'pType\',8), 0)\' 1 row in set (0.00 sec)
于是根据提示,以及查看表结构,发现果然是主键冲突了,但是表的记录却不一致,所以往往我们跳过错误就会导致主从数据不一致的问题。
主库上的记录:
mysql> select * from xxxx.xxxx where id=192442; +--------+-----------+------+-------+-------+---------------+ | id | playerId | type | count | total | lastResetTime | +--------+-----------+------+-------+-------+---------------+ | 192442 | 629014986 | 8 | 0 | 0 | 0 | +--------+-----------+------+-------+-------+---------------+ 1 row in set (0.00 sec) mysql>
从库上的记录:
mysql> select * from xxxx.xxxx where id=192442; +--------+-----------+------+-------+-------+---------------+ | id | playerId | type | count | total | lastResetTime | +--------+-----------+------+-------+-------+---------------+ | 192442 | 629015414 | 8 | 0 | 0 | 0 | +--------+-----------+------+-------+-------+---------------+ 1 row in set (0.00 sec) mysql>
有时候我会手动删除从库上提示的相应记录,但是往往是比较麻烦的。因为可能不止一条记录重复,可能N条,所以我就简单的写了一个脚本,删除主键冲突的记录,这个脚本只适合主键冲突的情况。
#!/bin/bash #Delete duplicate records primary key conflict #Write by yayun 2014-05-17 mysql=/usr/local/mysql-5.1.66/bin/mysql sock=/data/mysql-slave-3311/mysql.sock passwd=123456 while true do SQL_THREAD=`$mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | egrep \'Slave_SQL_Running\' | awk \'{print $2}\'` LAST_ERROR=`$mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | egrep Last_Errno | awk \'{print $2}\'` duplicate=`$mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | grep Last_Error | awk \'/Duplicate entry/{print $5}\' | awk -F "\'" \'{print $2}\'` DATABASE=`$mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | grep Last_Error | awk \'{print $13}\' | awk -F "\'" \'{print $2}\'` TABLE=`$mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | grep Last_Error | awk -F ":" \'{print $4}\' | awk -F "(" \'{print $1}\' | awk \'{print $NF}\'` $mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | grep HA_ERR_FOUND_DUPP_KEY if [ $? -eq 1 ] then if [ "$SQL_THREAD" == No ] && [ "$LAST_ERROR" == 1062 ] then FILED=`$mysql -uroot -p$passwd -S $sock -Nse "desc $DATABASE.$TABLE" | grep PRI | awk \'{print $1}\'` $mysql -uroot -p$passwd -S $sock -e "delete from $DATABASE.$TABLE where $FILED=$duplicate" $mysql -uroot -p$passwd -S $sock -e "start slave sql_thread" else echo "====================== ok ========================" $mysql -uroot -p$passwd -S $sock -e \'show slave status\G\' | egrep \'Slave_.*_Running\' echo "====================== ok ========================" break fi fi done
脚本随便写的。大家可以再自己相应的扩展一下。^_^