sql-server – Using GROUP BY with FIRST_VALUE and LAST_VALUE

I am working with some data that is currently stored at 1-minute intervals, like this:
CREATE TABLE #MinuteData
    (
      [Id] INT,
      [MinuteBar] DATETIME,
      [Open] NUMERIC(12,6),
      [High] NUMERIC(12,6),
      [Low] NUMERIC(12,6),
      [Close] NUMERIC(12,6)
    );

INSERT  INTO #MinuteData
        ( [Id],[MinuteBar],[Open],[High],[Low],[Close] )
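-- Note: some repeated High/Low values are reconstructed; they are chosen to agree with the desired output shown below.
VALUES  ( 1, '2015-01-01 17:00:00', 1.557870, 1.557880, 1.557870, 1.557880 ),
        ( 2, '2015-01-01 17:01:00', 1.557900, 1.557900, 1.557880, 1.557880 ),
        ( 3, '2015-01-01 17:02:00', 1.557960, 1.558070, 1.557960, 1.558040 ),
        ( 4, '2015-01-01 17:03:00', 1.558080, 1.558100, 1.558040, 1.558050 ),
        ( 5, '2015-01-01 17:04:00', 1.558050, 1.558100, 1.558020, 1.558030 ),
        ( 6, '2015-01-01 17:05:00', 1.558580, 1.558710, 1.557870, 1.557950 ),
        ( 7, '2015-01-01 17:06:00', 1.557910, 1.558120, 1.557910, 1.557990 ),
        ( 8, '2015-01-01 17:07:00', 1.557940, 1.558250, 1.557940, 1.558170 ),
        ( 9, '2015-01-01 17:08:00', 1.558140, 1.558200, 1.558080, 1.558120 ),
        ( 10, '2015-01-01 17:09:00', 1.558110, 1.558140, 1.557970, 1.557970 );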

SELECT  *
FROM    #MinuteData;

DROP TABLE #MinuteData;

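The values track currency exchange rates, so for each one-minute interval (bar) there is an Open price at the start of the minute and a Close price at the end of the minute. The High and Low values are the highest and lowest rates during that minute.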

Desired output

I want to reformat this data into 5-minute intervals to produce the following output:

MinuteBar                Open       Close       Low         High
2015-01-01 17:00:00.000  1.557870   1.558030    1.557870    1.558100
2015-01-01 17:05:00.000  1.558580   1.557970    1.557870    1.558710

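This takes the Open value from the first minute of the five and the Close value from the last minute of the five. The High and Low values are the highest High and the lowest Low over the 5-minute period.

Current solution

I have a solution that does this (below), but it feels inelegant because it relies on the Id values and on self joins. I also intend to run it on much larger data sets, so I would like to do it more efficiently if possible: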

-- Create a column to allow grouping in 5 minute Intervals
SELECT  Id,MinuteBar,[Open],High,Low,[Close],DATEDIFF(MINUTE,'2015-01-01T00:00:00',MinuteBar)/5 AS Interval
INTO    #5MinuteData
FROM    #MinuteData
ORDER BY minutebar

-- Group by interval and aggregate prior to self join
SELECT  Interval,MIN(MinuteBar) AS MinuteBar,MIN(Id) AS OpenId,MAX(Id) AS CloseId,MIN(Low) AS Low,MAX(High) AS High
INTO    #DataMinMax
FROM    #5MinuteData
GROUP BY Interval;

-- Self join to get the Open and Close values
SELECT  t1.Interval,t1.MinuteBar,tOpen.[Open],tClose.[Close],t1.Low,t1.High
FROM    #DataMinMax t1
        INNER JOIN #5MinuteData tOpen ON tOpen.Id = OpenId
        INNER JOIN #5MinuteData tClose ON tClose.Id = CloseId;

DROP TABLE #DataMinMax
DROP TABLE #5MinuteData
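
The Interval column above relies on integer division: DATEDIFF returns an INT, so dividing by 5 truncates and every run of five consecutive minutes gets the same bucket number. As a rough sketch (BucketStart is a hypothetical column name, and the query assumes #MinuteData has not been dropped yet), the bucket number can also be mapped back to the start time of its 5-minute bar:

```sql
-- Sketch: show which 5-minute bucket each 1-minute bar falls into.
-- BucketStart is an illustrative name, not part of the original queries.
SELECT  MinuteBar,
        -- Minutes since the anchor date, integer-divided by 5.
        DATEDIFF(MINUTE, '2015-01-01T00:00:00', MinuteBar) / 5 AS Interval,
        -- Multiply back by 5 and add to the anchor to get the bucket's first minute.
        DATEADD(MINUTE,
                (DATEDIFF(MINUTE, '2015-01-01T00:00:00', MinuteBar) / 5) * 5,
                '2015-01-01T00:00:00') AS BucketStart
FROM    #MinuteData
ORDER BY MinuteBar;
```

With the sample data, 17:00 through 17:04 are 1020 to 1024 minutes after the anchor and all land in Interval 204, while 17:05 through 17:09 land in Interval 205.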

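Rework attempt

Instead of the queries above, I have been looking at using FIRST_VALUE and LAST_VALUE, as they seem to be what I am after, but I cannot get them to work with the grouping I am doing. There may be a better solution than what I am attempting, so I am open to suggestions. Currently I am trying this: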

SELECT  MIN(MinuteBar) MinuteBar5,
        FIRST_VALUE([Open]) OVER (ORDER BY MinuteBar) AS opening,
        MAX(High) AS High,
        MIN(Low) AS Low,
        LAST_VALUE([Close]) OVER (ORDER BY MinuteBar) AS Closing,
        DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5 AS Interval
FROM    #MinuteData
GROUP BY DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5

This gives me the error below. It is caused by the FIRST_VALUE and LAST_VALUE lines, because the query runs if I remove them:

Column '#MinuteData.MinuteBar' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Answer

SELECT 
    MIN(MinuteBar) AS MinuteBar5,
    opening,
    MAX(High) AS High,
    MIN(Low) AS Low,
    Closing,
    Interval
FROM 
(
    SELECT FIRST_VALUE([Open]) OVER (PARTITION BY DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5 ORDER BY MinuteBar) AS opening,
           FIRST_VALUE([Close]) OVER (PARTITION BY DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5 ORDER BY MinuteBar DESC) AS Closing,
           DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5 AS Interval,
           *
    FROM #MinuteData
) AS T
GROUP BY Interval,opening,Closing

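This is close to your current attempt. There are two places where you went wrong.

> FIRST_VALUE and LAST_VALUE are analytic functions: they operate on a window or partition, not on a group. You can run the nested query on its own and look at its result.
> LAST_VALUE returns the last value of the current window frame. Your query does not specify a frame, and the default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) runs from the first row of the current partition to the current row. You can either use FIRST_VALUE with a descending order, or specify the frame explicitly: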

LAST_VALUE([Close]) OVER (PARTITION BY DATEDIFF(MINUTE,'2015-01-01 00:00:00',MinuteBar) / 5 
            ORDER BY MinuteBar 
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Closing,
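
To see the effect of the frame, the two forms of LAST_VALUE can be run side by side against the sample table. This is only an illustrative sketch: the column aliases ClosingDefaultFrame and ClosingFullFrame are made up for the comparison, and the query assumes #MinuteData still exists.

```sql
-- Sketch: LAST_VALUE with the default frame versus an explicit full-partition frame.
-- ClosingDefaultFrame and ClosingFullFrame are illustrative aliases.
SELECT  MinuteBar,
        [Close],
        -- Default frame ends at the current row, so the "last" value is the row's own Close.
        LAST_VALUE([Close]) OVER (
            PARTITION BY DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5
            ORDER BY MinuteBar
        ) AS ClosingDefaultFrame,
        -- Explicit frame spans the whole partition, so every row sees the bucket's final Close.
        LAST_VALUE([Close]) OVER (
            PARTITION BY DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5
            ORDER BY MinuteBar
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS ClosingFullFrame
FROM    #MinuteData
ORDER BY MinuteBar;
```

Because MinuteBar is unique in the sample, ClosingDefaultFrame should simply repeat each row's own [Close], while ClosingFullFrame should show the bucket's closing value on every row (1.558030 for the 17:00 bar and 1.557970 for the 17:05 bar, matching the desired output). The full-frame expression can stand in for the FIRST_VALUE ... ORDER BY MinuteBar DESC line in the derived table.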
