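I'm trying to find pairs of users who like exactly the same set of TV shows. Here is a simplified example.
Suppose I have a table in which each user has one row for every TV show they like: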
|USER | Show |
|-----|-------------|
|001 | Lost |
|001 | South Park |
|002 | Lost |
|003 | Lost |
|003 | South Park |
|004 | South Park |
|005 | Lost |
|006 | Lost |
I would then like to get this result:
|USER1 |USER2 |
|------|------|
|001 |003 |
|003 |001 |
|002 |005 |
|002 |006 |
|005 |002 |
|005 |006 |
|006 |002 |
|006 |005 |
Or, even better, a version that lists each pair only once:
|USER1 |USER2 |
|------|------|
|001 |003 |
|002 |005 |
|002 |006 |
|005 |006 |
I've been playing around with GROUP BY and JOIN, but I still can't find the answer :(.
The closest I have come so far is:
SELECT s1.User AS USER1, s2.User AS USER2, s1.`Show` AS `Show`
FROM Shows s1 JOIN Shows s2
ON s1.`Show` = s2.`Show` AND s1.User != s2.User;
This produces pairs of users together with the shows they have in common, but I don't know where to go from here.
Best answer
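If you can accept a comma-separated list of users instead of one row per pair, you can simply group the table twice: first by user to build each user's sorted list of shows, then by that list: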
SELECT GROUP_CONCAT(User) FROM (
SELECT User,GROUP_CONCAT(DISTINCT `Show` ORDER BY `Show` SEPARATOR 0x1e) AS s
FROM Shows
GROUP BY User
) t GROUP BY s
Otherwise, you can join the above subquery to itself:
SELECT DISTINCT LEAST(t.User,u.User) AS User1,GREATEST(t.User,u.User) AS User2
FROM (
SELECT User,GROUP_CONCAT(DISTINCT `Show` ORDER BY `Show` SEPARATOR 0x1e) AS s
FROM Shows
GROUP BY User
) t JOIN (
SELECT User,GROUP_CONCAT(DISTINCT `Show` ORDER BY `Show` SEPARATOR 0x1e) AS s
FROM Shows
GROUP BY User
) u USING (s)
WHERE t.User <> u.User
See both queries on sqlfiddle.
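Of course, if the Shows table is guaranteed to contain no duplicate (User, Show) pairs, you can improve performance by removing the DISTINCT keyword from the GROUP_CONCAT() aggregates.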
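For anyone who wants to check the expected pairs without a MySQL server, here is a minimal sketch that runs on SQLite through Python's sqlite3 module. It does not use the GROUP_CONCAT() queries above. It swaps in a counting technique that continues from the self-join in the question: two users like the same set of shows when the number of shows they share equals each user's own total. It assumes there are no duplicate (User, Show) pairs, which the primary key enforces here. The names ROWS, QUERY and find_pairs are made up for this illustration.

```python
import sqlite3

# Sample rows copied from the question; user IDs are stored as text
# so the leading zeros survive.
ROWS = [
    ("001", "Lost"), ("001", "South Park"),
    ("002", "Lost"),
    ("003", "Lost"), ("003", "South Park"),
    ("004", "South Park"),
    ("005", "Lost"),
    ("006", "Lost"),
]

# Counting technique (not the GROUP_CONCAT approach): keep a pair only if
# the number of shared shows equals the total number of shows of each user.
# The a."User" < b."User" condition lists every pair exactly once.
QUERY = """
SELECT a."User" AS User1, b."User" AS User2
FROM Shows a
JOIN Shows b ON a."Show" = b."Show" AND a."User" < b."User"
GROUP BY a."User", b."User"
HAVING COUNT(*) = (SELECT COUNT(*) FROM Shows c WHERE c."User" = a."User")
   AND COUNT(*) = (SELECT COUNT(*) FROM Shows d WHERE d."User" = b."User")
ORDER BY User1, User2
"""


def find_pairs(rows):
    """Load rows into an in-memory SQLite table and return the matching pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        # The primary key guarantees the no-duplicates assumption.
        conn.execute(
            'CREATE TABLE Shows ('
            '"User" TEXT NOT NULL, "Show" TEXT NOT NULL, '
            'PRIMARY KEY ("User", "Show"))'
        )
        conn.executemany("INSERT INTO Shows VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


pairs = find_pairs(ROWS)
for user1, user2 in pairs:
    print(user1, user2)
```

On the sample data this should print the four pairs from the "even better" table in the question. The same SQL should also work in MySQL if the double-quoted identifiers are replaced with backticks, though I have only reasoned about it under SQLite.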