今天同事在主庫在給一個大表的欄位新加了一個索引,因為是大表,所以用了pt-osc工具,在添加完索引沒過多久,開發那邊反應丟數據了。 這個表以前是寫在主庫的,後來不知道是什麼原因改成了寫從庫,也就是說主庫有部分以前的歷史數據,但是新的數據都是寫在從庫上,同事用pt-osc加索引的時候是在主庫加的,於是 ...
今天同事在主庫在給一個大表的欄位新加了一個索引,因為是大表,所以用了pt-osc工具,在添加完索引沒過多久,開發那邊反應丟數據了。
這個表以前是寫在主庫的,後來不知道是什麼原因改成了寫從庫,也就是說主庫有部分以前的歷史數據,但是新的數據都是寫在從庫上,同事用pt-osc加索引的時候是在主庫加的,於是把主庫的歷史數據同步過來了,從庫上的新數據丟了。
通過以下例子也以看出為什麼會發生這種情況:
>select count(*) from goods;
+----------+
| count(*) |
+----------+
| 1426200 |
+----------+
CREATE TABLE `goods` (
`rec_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`user_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`goods_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`add_time` int(11) unsigned NOT NULL DEFAULT '0',
`is_attention` tinyint(1) NOT NULL DEFAULT '0',
`wid` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '倉id,預設為1表示中國倉',
PRIMARY KEY (`rec_id`),
KEY `user_id` (`user_id`),
KEY `goods_id` (`goods_id`),
KEY `is_attention` (`is_attention`)
) ENGINE=InnoDB AUTO_INCREMENT=1721748 DEFAULT CHARSET=utf8
我們先把goods表上的user_id索引刪除:
alter table goods drop key user_id;
在主庫上刪除部分數據:
>set sql_log_bin=off;
>delete from goods where rec_id>721747;
Query OK, 848254 rows affected (20.24 sec)
>set sql_log_bin=on;
>select count(*) from test;
+----------+
| count(*) |
+----------+
| 577946 |
+----------+
測試一:
先直接用alter table語句對goods表的user_id欄位加索引:
>alter table test add key (user_id);
主庫:
show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`rec_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`user_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`goods_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`add_time` int(11) unsigned NOT NULL DEFAULT '0',
`is_attention` tinyint(1) NOT NULL DEFAULT '0',
`wid` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '倉id,預設為1表示中國倉',
PRIMARY KEY (`rec_id`),
KEY `goods_id` (`goods_id`),
KEY `is_attention` (`is_attention`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1721748 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
>select count(*) from test;
+----------+
| count(*) |
+----------+
| 577946 |
+----------+
1 row in set (0.22 sec)
從庫:
show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`rec_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`user_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`goods_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`add_time` int(11) unsigned NOT NULL DEFAULT '0',
`is_attention` tinyint(1) NOT NULL DEFAULT '0',
`wid` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '倉id,預設為1表示中國倉',
PRIMARY KEY (`rec_id`),
KEY `goods_id` (`goods_id`),
KEY `is_attention` (`is_attention`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1721748 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
>select count(*) from test;
+----------+
| count(*) |
+----------+
| 1426200 |
+----------+
1 row in set (0.47 sec)
可以看到直接用alter table加索引的時候從庫的數據沒有變化。
測試二:
用pt-osc工作在主庫goods表的user_id列加索引
先刪除user_id列的索引,在主庫上執行:
alter table test drop index user_id;
pt-osc加索引:
pt-online-schema-change --nocheck-replication-filters --recursion-method=none --charset=utf8 --alter "add index (user_id) " h=localhost,P=3306,u=root,p=123456,D=test,t=test --print --execute
Successfully altered `test`.`test`. 提示索引加成功。
主庫:
>select count(*) from test;
+----------+
| count(*) |
+----------+
| 577946 |
+----------+
從庫:
>select count(*) from test;
+----------+
| count(*) |
+----------+
| 577946 |
+----------+
可以看到,從庫的數據被主庫覆蓋。
為什麼會出現這個原因呢,我們可以具體看一下加索引的過程:
# pt-online-schema-change --nocheck-replication-filters --recursion-method=none --charset=utf8 --alter "add index (user_id) " h=localhost,P=3306,u=root,p=123456,D=test,t=test --print --execute
No slaves found. See --recursion-method if host master has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `test`.`test`...
Creating new table...
CREATE TABLE `test`.`_test_new` (
`rec_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`user_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`goods_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`add_time` int(11) unsigned NOT NULL DEFAULT '0',
`is_attention` tinyint(1) NOT NULL DEFAULT '0',
`wid` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '倉id,預設為1表示中國倉',
PRIMARY KEY (`rec_id`),
KEY `goods_id` (`goods_id`),
KEY `is_attention` (`is_attention`)
) ENGINE=InnoDB AUTO_INCREMENT=1721748 DEFAULT CHARSET=utf8
Created new table test._test_new OK.
Altering new table...
ALTER TABLE `test`.`_test_new` add index (user_id)
Altered `test`.`_test_new` OK.
2016-02-01T18:11:48 Creating triggers...
CREATE TRIGGER `pt_osc_test_test_del` AFTER DELETE ON `test`.`test` FOR EACH ROW DELETE IGNORE FROM `test`.`_test_new` WHERE `test`.`_test_new`.`rec_id` <=> OLD.`rec_id`
CREATE TRIGGER `pt_osc_test_test_upd` AFTER UPDATE ON `test`.`test` FOR EACH ROW REPLACE INTO `test`.`_test_new` (`rec_id`, `user_id`, `goods_id`, `add_time`, `is_attention`, `wid`) VALUES (NEW.`rec_id`, NEW.`user_id`, NEW.`goods_id`, NEW.`add_time`, NEW.`is_attention`, NEW.`wid`)
CREATE TRIGGER `pt_osc_test_test_ins` AFTER INSERT ON `test`.`test` FOR EACH ROW REPLACE INTO `test`.`_test_new` (`rec_id`, `user_id`, `goods_id`, `add_time`, `is_attention`, `wid`) VALUES (NEW.`rec_id`, NEW.`user_id`, NEW.`goods_id`, NEW.`add_time`, NEW.`is_attention`, NEW.`wid`)
2016-02-01T18:11:48 Created triggers OK.
2016-02-01T18:11:48 Copying approximately 578405 rows...
INSERT LOW_PRIORITY IGNORE INTO `test`.`_test_new` (`rec_id`, `user_id`, `goods_id`, `add_time`, `is_attention`, `wid`) SELECT `rec_id`, `user_id`, `goods_id`, `add_time`, `is_attention`, `wid` FROM `test`.`test` FORCE INDEX(`PRIMARY`) WHERE ((`rec_id` >= ?)) AND ((`rec_id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 30910 copy nibble*/
SELECT /*!40001 SQL_NO_CACHE */ `rec_id` FROM `test`.`test` FORCE INDEX(`PRIMARY`) WHERE ((`rec_id` >= ?)) ORDER BY `rec_id` LIMIT ?, 2 /*next chunk boundary*/
2016-02-01T18:12:09 Copied rows OK.
2016-02-01T18:12:09 Swapping tables...
RENAME TABLE `test`.`test` TO `test`.`_test_old`, `test`.`_test_new` TO `test`.`test`
2016-02-01T18:12:09 Swapped original and new tables OK.
2016-02-01T18:12:09 Dropping old table...
DROP TABLE IF EXISTS `test`.`_test_old`
2016-02-01T18:12:09 Dropped old table `test`.`_test_old` OK.
2016-02-01T18:12:09 Dropping triggers...
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_test_del`;
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_test_upd`;
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_test_ins`;
2016-02-01T18:12:09 Dropped triggers OK.
Successfully altered `test`.`test`.
在建索引的過程中,pt-osc會新將原表的數據拷貝到一張臨時表裡面,創建三個存儲過程來同步有變更的數據,先在臨時表上面加索引,加完索引後再將臨時表rename。
其實如果可以儘量只寫主庫不寫從庫,這樣可以避免很多坑。