Here is the dummy data,这是一个呼叫记录数据表.
这是它的一瞥:
| call_id | customer | company | call_start |
|-----------|--------------|-------------|---------------------|
|1411482360 | 001143792042 | 08444599175 | 2014-07-31 13:55:03 |
|1476992122 | 001143792042 | 08441713191 | 2014-07-31 14:05:10 |
客户和公司字段代表他们的电话号码.
>要求是根据以下逻辑计算总“增益”和总“丢失”值:
编辑:
– 客户A致电公司A.
– 如果客户A致电B公司,那么B公司将获得1个收益,而A公司将有1个收益.
– 如果客户A致电公司C,则公司C将获得1个收益,而公司B将收取1个损失.
– 如果客户A再次致电公司C,则溢出/收益不会受到影响.
– 一旦客户A第二次拨打电话,收益/损失才会发挥作用.
– 如果客户按此顺序呼叫公司:A,B,C,A,D,则流程应如下所示:
A ->
B -> B +1 gain,A +1 lost
B ->
C -> C +1 gain,B +1 lost
A -> A +1 gain,C +1 lost
A ->
C -> C +1 gain,A +1 lost
B -> B +1 gain,C +1 lost
D -> D +1 gain,B +1 lost
经过上述过程后,我们应该将总值设为:
Company Total gain Total lost
A 1 2
B 2 2
C 2 2
D 1 0
我开始研究这个但是它错了,它只是一个想法,它不会根据上述条件给我单独增加的增益和丢失值:
DROP TABLE IF EXISTS GetTotalGainAndLost;
CREATE TEMPORARY TABLE IF NOT EXISTS GetTotalGainAndLost
AS
(
SELECT SUM(count) as 'TotalGainAndLost',`date`,DAY(`date`) as 'DAY'
FROM (SELECT count(*) as 'count',customer,`date`
FROM (SELECT customer,company,count(*) AS 'count',DATE_FORMAT(`call_end`,'%Y-%m-%d') as 'date'
FROM calls
WHERE `call_end` LIKE CONCAT(2014,'-',RIGHT(CAST(concat('0',01) AS CHAR),2),'-%')
GROUP BY customer,DAY(`call_end`) ORDER BY `call_end` ASC)
as tbl1 group by customer,`date` having count(*) > 1)
as tbl2 GROUP by `date`
);
Select * from GetTotalGainAndLost;
DROP TABLE GetTotalGainAndLost;
>所需的输出如下所示:
每个公司和日期应该是一行(总收益和一天中的丢失电话,例如1月)
| company | totalGain | totalLost | date | DAY |
|-------------|------------|-------------|--------------|-------|
| 08444599175 | 17 | 6 | 2014-07-01 | 1 |
| 08444599175 | 12 | 10 | 2014-07-02 | 2 |
| 08444599175 | 3 | 6 | 2014-07-02 | 3 |
| 08444599175 | .... | ... | ... | ... |
| 08444599175 | 7 | 6 | 2014-07-31 | 31 |
让我们将N表示为公司出现的次数.让我们尝试在三个简单的规则中简化公式.
>出现的第一家公司将获得N – 1收益,N亏损.
>中期公司将有N收益,N亏损.
>最后一家公司将获得N增益,N – 1损失
测试
在你的例子中:
>从公司A开始,它出现3次.
> B公司出现3次
>公司C出现2次
>以公司D结束,出现1次.
结果
Company Gain Lost
A 2 3
B 3 3
C 2 2
D 1 0
转换为sql
首先,我们首先计算每家公司的数量.
SELECT
company,COUNT(*) AS gain,COUNT(*) AS lost,DATE(call_start) AS date
FROM calls
GROUP BY DATE(call_start),company
然后,我们开始选择每个公司第一次出现在每个客户的号码.
SELECT company,-COUNT(*) AS gain,0 AS lost,DATE(call_start) AS `date`
FROM calls INNER JOIN (
SELECT MIN(call_id) AS call_id FROM calls GROUP BY DATE(call_start),customer
) AS t ON (calls.call_id = t.call_id)
GROUP BY DATE(call_start),calls.company
最后出现的公司数量.
SELECT company,0 AS gain,-COUNT(*) AS lost,DATE(call_start) AS `date`
FROM calls INNER JOIN (
SELECT MAX (call_id) AS call_id FROM calls GROUP BY DATE(call_start),calls.company
结合sql
最后,我们可以使用UNION ALL将整个sql组合在一起,然后再执行另一个组.
SELECT company,SUM(gain) AS gain,SUM(lost) AS lost,`date` FROM (
(
SELECT
company,DATE(call_start) AS `date`
FROM calls
GROUP BY DATE(call_start),company
) UNION ALL (
SELECT company,DATE(call_start) AS `date`
FROM calls INNER JOIN (
SELECT MIN(call_id) AS call_id FROM calls GROUP BY DATE(call_start),customer
) AS t ON (calls.call_id = t.call_id)
GROUP BY DATE(call_start),calls.company
) UNION ALL (
SELECT company,DATE(call_start) AS `date`
FROM calls INNER JOIN (
SELECT MAX(call_id) AS call_id FROM calls GROUP BY DATE(call_start),calls.company
)
) AS t
GROUP BY `date`,company
澄清
上述查询假设每个新的一天都是独立的.例如,
>客户A致电公司A(第1天)
>客户A致电公司B(第1天)B获得1,A输1
>客户A致电公司C(第1天)C获得1,B输了1
>客户A致电公司D(第2天)
>客户A呼叫公司E(第2天)E获得1,D输掉1
结果将是
COM G L DAY
----------------
A 0 1 1
B 1 1 1
C 1 0 1
D 0 1 2
E 1 0 2