Mysql重复数据去重保留一条数据

tracer-dhy 2021-11-16 原文

mysql 重复数据去重，保留一条

创建一张测试表

create table poi 
(
id bigint(20) NOT NULL AUTO_INCREMENT COMMENT \'id\',
poi_id bigint(20) NOT NULL COMMENT \'poi_id\',
 PRIMARY KEY (`id`)
);

插入测试数据

insert into poi (poi_id) values (10),(10),(10),(20),(20),(30),(40);

查找重复的poi_id

poi_id=10 重复了三条，poi_id=20重复了两条

select poi_id,count(poi_id) from poi group by poi_id having count(poi_id) > 1;

将重复的数据删除但是要保留一条

1、查找需要删除的数据

select * from poi where poi_id in
(
select poi_id from poi group by poi_id having count(poi_id) >1  
)
and id not in (select min(id) from poi group by poi_id having count(poi_id)>1);

分析:

在重复的poi_id集合中，保留每个重复poi_id中id最小的那条记录

1、找出重复的POI_ID集合

poi_id in (select poi_id from poi group by poi_id having count(poi_id) >1)    表示找出重复的poi_id

2、在重复的poi_id集合中，按照poi_id分组，保留每个重复的poi_id的最小id

id not in(select min(id) from poi group by poi_id having count(poi_id)>1)  表示在重复的poi_id集合中，按照poi_id分组，保留每个重复的poi_i

2、执行删除

如果执行将查询到的数据作为删除的条件，mysql会提示 “You can\’t specify target table \’poi\’ for update in FROM clause”

delete from poi where id in 
(
select id from poi where poi_id in
(
select poi_id from poi group by poi_id having count(poi_id) >1
)
and id not in (select min(id) from poi group by poi_id having count(poi_id)>1)
);

遇到这种情况，我们需要将子查询条件再包一层就可以了

delete from poi where id in 
(
select id from (
select * from poi where poi_id in
(
select poi_id from poi group by poi_id having count(poi_id) >1  
)
and id not in (select min(id) from poi group by poi_id having count(poi_id)>1)
) a
);