我的表格如下:
ADM_ID WH_PID WH_IN_DATETIME WH_OUT_DATETIME
我的规则是:
>如果WH_OUT_DATETIME在具有相同WH_P_ID的另一个ADM_ID的WH_IN_DATETIME的24小时内或之后24小时内
我想在结果中添加另一列,如果可能的话,将列表值标识为EP_ID.
例如
ADM_ID WH_PID WH_IN_DATETIME WH_OUT_DATETIME ------ ------ -------------- --------------- 1 9 2014-10-12 00:00:00 2014-10-13 15:00:00 2 9 2014-10-14 14:00:00 2014-10-15 15:00:00 3 9 2014-10-16 14:00:00 2014-10-17 15:00:00 4 9 2014-11-20 00:00:00 2014-11-21 00:00:00 5 5 2014-10-17 00:00:00 2014-10-18 00:00:00
将返回行:
ADM_ID WH_PID EP_ID EP_IN_DATETIME EP_OUT_DATETIME WH_IN_DATETIME WH_OUT_DATETIME ------ ------ ----- ------------------- ------------------- ------------------- ------------------- 1 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-12 00:00:00 2014-10-13 15:00:00 2 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-14 14:00:00 2014-10-15 15:00:00 3 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-16 14:00:00 2014-10-17 15:00:00 4 9 2 2014-11-20 00:00:00 2014-11-20 00:00:00 2014-10-16 14:00:00 2014-11-21 00:00:00 5 5 1 2014-10-17 00:00:00 2014-10-18 00:00:00 2014-10-17 00:00:00 2014-10-18 00:00:00
EP_OUT_DATETIME将始终是组中的最新日期.希望这有点澄清.
通过这种方式,我可以按EP_ID进行分组,找到EP_OUT_DATETIME以及任何属于其中的ADM_ID / PID的开始时间.
每个都应该滚动到下一个,这意味着如果另一行的WH_IN_DATETIME在另一行的WH_OUT_DATETIME之后对于相同的WH_PID,则该行的WH_OUT_DATETIME变为该EP_ID内所有WH_PID的EP_OUT_DATETIME.
我希望这是有道理的.
谢谢,
先生
解决方法
步骤是:
>创建临时表
>使用所有ADM_ID值对临时表进行种子处理 – 这使我们可以执行更新,因为所有行都已存在.
>更新临时表
>最后,简单的选择将临时表连接到主表
测试设置
SET ANSI_NULLS ON; SET NOCOUNT ON; CREATE TABLE #Table ( ADM_ID INT NOT NULL PRIMARY KEY,WH_PID INT NOT NULL,WH_IN_DATETIME DATETIME NOT NULL,WH_OUT_DATETIME DATETIME NOT NULL ); INSERT INTO #Table VALUES (1,9,'2014-10-12 00:00:00','2014-10-13 15:00:00'); INSERT INTO #Table VALUES (2,'2014-10-14 14:00:00','2014-10-15 15:00:00'); INSERT INTO #Table VALUES (3,'2014-10-16 14:00:00','2014-10-17 15:00:00'); INSERT INTO #Table VALUES (4,'2014-11-20 00:00:00','2014-11-21 00:00:00'); INSERT INTO #Table VALUES (5,5,'2014-10-17 00:00:00','2014-10-18 00:00:00');
第1步:创建并填充临时表
CREATE TABLE #Scratch ( ADM_ID INT NOT NULL PRIMARY KEY,EP_ID INT NOT NULL -- Might need WH_PID and WH_IN_DATETIME fields to guarantee proper UPDATE ordering ); INSERT INTO #Scratch (ADM_ID,EP_ID) SELECT ADM_ID,0 FROM #Table;
备用临时表结构以确保正确的更新顺序(因为“古怪更新”使用聚集索引的顺序,如本答案底部所述):
CREATE TABLE #Scratch ( WH_PID INT NOT NULL,ADM_ID INT NOT NULL,EP_ID INT NOT NULL ); INSERT INTO #Scratch (WH_PID,WH_IN_DATETIME,ADM_ID,EP_ID) SELECT WH_PID,0 FROM #Table; CREATE UNIQUE CLUSTERED INDEX [CIX_Scratch] ON #Scratch (WH_PID,ADM_ID);
步骤2:使用局部变量更新Scratch Table以跟踪先前值
DECLARE @EP_ID INT; -- this is used in the UPDATE ;WITH cte AS ( SELECT TOP (100) PERCENT t1.*,t2.WH_OUT_DATETIME AS [PriorOut],t2.ADM_ID AS [PriorID],ROW_NUMBER() OVER (PARTITION BY t1.WH_PID ORDER BY t1.WH_IN_DATETIME) AS [RowNum] FROM #Table t1 LEFT JOIN #Table t2 ON t2.WH_PID = t1.WH_PID AND t2.ADM_ID <> t1.ADM_ID AND t2.WH_OUT_DATETIME >= (t1.WH_IN_DATETIME - 1) AND t2.WH_OUT_DATETIME < t1.WH_IN_DATETIME ORDER BY t1.WH_PID,t1.WH_IN_DATETIME ) UPDATE sc SET @EP_ID = sc.EP_ID = CASE WHEN cte.RowNum = 1 THEN 1 WHEN cte.[PriorOut] IS NULL THEN (@EP_ID + 1) ELSE @EP_ID END FROM #Scratch sc INNER JOIN cte ON cte.ADM_ID = sc.ADM_ID
第3步:选择加入划痕表
SELECT tab.ADM_ID,tab.WH_PID,sc.EP_ID,MIN(tab.WH_IN_DATETIME) OVER (PARTITION BY tab.WH_PID,sc.EP_ID) AS [EP_IN_DATETIME],MAX(tab.WH_OUT_DATETIME) OVER (PARTITION BY tab.WH_PID,sc.EP_ID) AS [EP_OUT_DATETIME],tab.WH_IN_DATETIME,tab.WH_OUT_DATETIME FROM #Table tab INNER JOIN #Scratch sc ON sc.ADM_ID = tab.ADM_ID ORDER BY tab.ADM_ID;
资源
寻找“@variable = column = expression”
> Performance Analysis of doing Running Totals(与此不完全相同,但不太远)
这篇博文确实提到:
> PRO:这种方法通常很快> CON:“UPDATE的顺序由聚集索引的顺序控制”.此行为可能会根据具体情况排除使用此方法.但在这种特殊情况下,如果WH_PID值至少不是通过聚簇索引的排序自然地组合在一起并由WH_IN_DATETIME排序,那么这两个字段只会被添加到临时表和PK(带有隐含的聚簇索引)上临时表变为(WH_PID,ADM_ID).