Here is my current query:
SELECT DATEDIFF(created_at, '2020-07-01') DIV 6 period,
user_id FROM transactions
WHERE DATE(created_at) >= '2020-07-01'
GROUP BY user_id, DATEDIFF(created_at, '2020-07-01') DIV 6
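ORDER BY period

It returns the list of users who have at least one transaction in each period (a period is 6 days). Here is a simplified version of the current output: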
// res_table
+--------+---------+
| period | user_id |
+--------+---------+
| 0 | 1111 |
| 0 | 2222 |
| 0 | 3333 |
| 1 | 7777 |
| 1 | 1111 |
| 2 | 2222 |
| 2 | 1111 |
| 2 | 8888 |
| 2 | 3333 |
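+--------+---------+

Now I need to know, for the users who first appear in a given period, how many of them made at least one transaction again in each later period (in marketing terms, I am trying to draw the retention rate as a cohort chart). So the calculation has to be done in a Cartesian fashion, like a self-join!
Here is the expected result: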
+---------+---------+------------+
| periodX | periodY | percentage |
+---------+---------+------------+
| 0 | 0 | 100% | -- It means 3 users exist in period 0 and, logically, all of them exist in period 0. So 3/3=100%
| 0 | 1 | 33% | -- It means 3 users exist in period 0, and just 1 of them exists in period 1. So 1/3=33%
| 0 | 2 | 66% | -- It means 3 users exist in period 0, and just 2 of them exist in period 2. So 2/3=66%
| 1 | 1 | 100% | -- It means 1 user (only #7777; #1111 is ignored because it already appeared in a previous period) exists in period 1 and, logically, it exists in period 1. So 1/1=100%
| 1 | 2 | 0% |
| 2 | 2 | 100% |
+---------+---------+------------+

Is it possible to do this in pure MySQL?
Posted on 2021-04-19 10:59:42
You can use window functions:
SELECT first_period, period, COUNT(*),
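-- divide by the cohort size, i.e. the count in the cohort's own first period
COUNT(*) / FIRST_VALUE(COUNT(*)) OVER (PARTITION BY first_period ORDER BY period) as ratio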
FROM (SELECT DATEDIFF(created_at, '2020-07-01') DIV 6 period,
user_id,
MIN(MIN(DATEDIFF(created_at, '2020-07-01') DIV 6) OVER (PARTITION BY user_id)) as first_period
FROM transactions
WHERE DATE(created_at) >= '2020-07-01'
GROUP BY user_id, DATEDIFF(created_at, '2020-07-01') DIV 6
) u
GROUP BY first_period, period
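ORDER BY first_period, period;

This does not include missing period pairs (such as periodX 1 / periodY 2 in the expected result, where no user returns). Handling those is a bit of a nuisance, because you need to enumerate all the periods: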
with periods as (
select 0 as period union all
select 1 as period union all
select 2 as period
)
select p1.period, p2.period, COUNT(u.user_id)
from periods p1 join
periods p2
on p1.period <= p2.period left join
(SELECT DATEDIFF(created_at, '2020-07-01') DIV 6 period,
user_id,
MIN(MIN(DATEDIFF(created_at, '2020-07-01') DIV 6) OVER (PARTITION BY user_id)) as first_period
FROM transactions
WHERE DATE(created_at) >= '2020-07-01'
GROUP BY user_id, DATEDIFF(created_at, '2020-07-01') DIV 6
) u
ON p1.period = u.first_period AND p2.period = u.period
GROUP BY p1.period, p2.period;

https://stackoverflow.com/questions/67160680
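
The enumeration query returns a user count per period pair but not the percentage column the question asks for. One possible way to get it is sketched below. It assumes MySQL 8.0 or later (CTEs and window functions) and that periods 0 to 2 are the ones of interest, as in the sample output. The recursive `periods` CTE, the `u` CTE name, the `users` column and the use of `FLOOR` (chosen only so that 2/3 shows as 66, as in the question) are illustrative choices, not part of the original answer.

```sql
-- Sketch only: assumes MySQL 8.0+ and that periods 0..2 are wanted.
WITH RECURSIVE periods AS (
    -- generate 0, 1, 2 instead of listing each period by hand
    SELECT 0 AS period
    UNION ALL
    SELECT period + 1 FROM periods WHERE period < 2
),
u AS (
    -- one row per (user, period), plus the first period the user appears in
    SELECT user_id,
           DATEDIFF(created_at, '2020-07-01') DIV 6 AS period,
           MIN(MIN(DATEDIFF(created_at, '2020-07-01') DIV 6))
               OVER (PARTITION BY user_id) AS first_period
    FROM transactions
    WHERE DATE(created_at) >= '2020-07-01'
    GROUP BY user_id, DATEDIFF(created_at, '2020-07-01') DIV 6
)
SELECT p1.period AS periodX,
       p2.period AS periodY,
       COUNT(u.user_id) AS users,
       -- cohort size = count in the first row of each periodX partition,
       -- which is the pair where periodY = periodX;
       -- NULLIF turns an empty cohort into NULL instead of dividing by zero
       FLOOR(100 * COUNT(u.user_id)
             / NULLIF(FIRST_VALUE(COUNT(u.user_id))
                          OVER (PARTITION BY p1.period ORDER BY p2.period), 0)
       ) AS percentage
FROM periods p1
JOIN periods p2
  ON p1.period <= p2.period
LEFT JOIN u
  ON u.first_period = p1.period AND u.period = p2.period
GROUP BY p1.period, p2.period
ORDER BY p1.period, p2.period;
```

Because of the LEFT JOIN, a pair with no returning users still appears with a count of 0, and a period in which no user appears for the first time yields a NULL percentage rather than an error. To cover more periods, raise the upper bound in the recursive CTE.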