我有一个表,消息在发生时存储.通常有一条消息’A’,有时A被单个消息’B’分隔.
现在我想对值进行分组,以便我能够分析它们,例如找到最长的’A’条纹或’A’条纹的分布.
现在我想对值进行分组,以便我能够分析它们,例如找到最长的’A’条纹或’A’条纹的分布.
我已经尝试了COUNT-OVER查询,但是仍在为每条消息计算.
SELECT message,COUNT(*) OVER (ORDER BY Timestamp RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
这是我的示例数据:
Timestamp Message 20150329 00:00 A 20150329 00:01 A 20150329 00:02 B 20150329 00:03 A 20150329 00:04 A 20150329 00:05 A 20150329 00:06 B
我想要关注输出
Message COUNT A 2 B 1 A 3 B 1
解决方法
那很有趣:)
;WITH cte as ( SELECT Messages.Message,Timestamp,ROW_NUMBER() OVER(PARTITION BY Message ORDER BY Timestamp) AS gn,ROW_NUMBER() OVER (ORDER BY Timestamp) AS rn FROM Messages ),cte2 AS ( SELECT Message,gn,rn,gn - rn as gb FROM cte ),cte3 AS ( SELECT Message,MIN(Timestamp) As Ts,COUNT(1) as Cnt FROM cte2 GROUP BY Message,gb) SELECT Message,Cnt FROM cte3 ORDER BY Ts
这是结果集:
Message Cnt A 2 B 1 A 3 B 1
查询可能会更短,但我会以这种方式发布,以便您可以看到正在发生的事情.
结果完全按照要求.这是最重要的部分,我们的想法是对每个分区中的行进行编号,同时对整个集合中的行进行编号,然后如果从另一个中减去一个,则会得到每个分区的“等级”.组.
;WITH cte as ( SELECT Messages.Message,gn - rn as gb FROM cte ) SELECT * FROM cte2 Message Timestamp gn rn gb A 2015-03-29 00:00:00.000 1 1 0 A 2015-03-29 00:01:00.000 2 2 0 B 2015-03-29 00:02:00.000 1 3 -2 A 2015-03-29 00:03:00.000 3 4 -1 A 2015-03-29 00:04:00.000 4 5 -1 A 2015-03-29 00:05:00.000 5 6 -1 B 2015-03-29 00:06:00.000 2 7 -5