What's the best way to remove duplicates while merging their records into one?
I have a table that tracks player names and their records, like this:
stats
-------------------------------
nick      totalgames  wins  ...
John      100         40
john      200         97
Whistle   50          47
wHiStLe   75          72
...
I need to merge the rows whose nick is a duplicate when case is ignored, combining their records into one row, like this:
stats
-------------------------------
nick      totalgames  wins  ...
john      300         137
whistle   125         119
...
I'm doing this in Postgres. What's the best way to do it?
I know I can get the names that have duplicates by doing this:
select lower(nick) as nick, totalgames, count(*) from stats group by lower(nick), totalgames having count(*) > 1;
I was thinking of something like this:
update stats set totalgames = totalgames + s.totalgames from (that query up there) s where lower(nick) = s.nick
Solution
Here's your update:
UPDATE stats
SET totalgames = x.games,
    wins       = x.wins
FROM (
    SELECT LOWER(nick) AS nick,
           SUM(totalgames) AS games,
           SUM(wins) AS wins
    FROM stats
    GROUP BY LOWER(nick)
) AS x
WHERE LOWER(stats.nick) = x.nick;

DELETE FROM stats
USING stats s2
WHERE LOWER(stats.nick) = LOWER(s2.nick)
  AND stats.nick < s2.nick;
(Note that the 'update ... from' and 'delete ... using' syntax is Postgres-specific, and was shamelessly stolen from this answer and this answer.)
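To see the two statements work together, here is a minimal sketch that replays them against the sample data from the question. It uses a temporary table, which shadows any real stats table for the session. The column types (text nick, integer totalgames and wins) and the transaction wrapper are assumptions, since the question does not show the table definition.

```sql
-- Scratch copy of the question's sample data; column types are assumed.
BEGIN;

CREATE TEMP TABLE stats (
    nick       text    NOT NULL,
    totalgames integer NOT NULL,
    wins       integer NOT NULL
);

INSERT INTO stats (nick, totalgames, wins) VALUES
    ('John',    100, 40),
    ('john',    200, 97),
    ('Whistle',  50, 47),
    ('wHiStLe',  75, 72);

-- Step 1: give every row in a case-insensitive group the group's totals.
UPDATE stats
SET totalgames = x.games,
    wins       = x.wins
FROM (
    SELECT lower(nick) AS nick,
           sum(totalgames) AS games,
           sum(wins) AS wins
    FROM stats
    GROUP BY lower(nick)
) AS x
WHERE lower(stats.nick) = x.nick;

-- Step 2: within each group, delete every row that has a sibling whose
-- nick sorts after it, so only the last-sorting spelling remains.
DELETE FROM stats
USING stats s2
WHERE lower(stats.nick) = lower(s2.nick)
  AND stats.nick < s2.nick;

-- Which spelling survives depends on the collation, so lowercase it for display.
SELECT lower(nick) AS nick, totalgames, wins
FROM stats
ORDER BY 1;

COMMIT;
```

The final SELECT should return john 300 137 and whistle 125 119, matching the result asked for in the question. One caveat: the DELETE only removes rows whose nick differs in spelling from a sibling, so two rows with byte-for-byte identical nicks would both survive and would need separate handling.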
You'll probably also want to run this to lowercase all the names:
UPDATE stats SET nick = LOWER(nick);
Aaaand throw a unique index on the lowercase version of 'nick' (or add a constraint to that column to disallow non-lowercase values):
CREATE UNIQUE INDEX ON stats (LOWER(nick));
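The alternative mentioned in the parentheses, a constraint that disallows non-lowercase values, could look roughly like the sketch below. The constraint name stats_nick_lowercase is made up for illustration, and the scratch table with its column types is an assumption so that the example runs on its own. On a real table the existing rows would have to be lowercased first, because Postgres checks existing rows when the constraint is added.

```sql
-- Scratch table so the sketch runs on its own; column types are assumed.
CREATE TEMP TABLE stats (
    nick       text    NOT NULL,
    totalgames integer NOT NULL,
    wins       integer NOT NULL
);

-- Hypothetical constraint name; rejects any nick that is not already lowercase.
ALTER TABLE stats
    ADD CONSTRAINT stats_nick_lowercase CHECK (nick = lower(nick));

-- Accepted: the nick is already lowercase.
INSERT INTO stats (nick, totalgames, wins) VALUES ('john', 300, 137);

-- Would be rejected with a check-constraint violation:
-- INSERT INTO stats (nick, totalgames, wins) VALUES ('John', 1, 0);
```

The check constraint only enforces lowercase spelling. It does not stop two identical lowercase rows from being inserted, so a unique index is still needed alongside it.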