复制 → 改表名 → 执行,快速定位数据问题

层级 | 核心校验 | 原因 |
|---|---|---|
ODS | 完整性 + 时效性 + 波动性 | 源头数据不可控,第一时间发现问题 |
DWD | 唯一性 + 准确性 + 一致性 | 加工逻辑最复杂,最容易出错 |
DWS | 一致性 + 波动性 | 聚合结果要和明细对得上 |
ADS | 全量复检 | 面向业务方,交付前最后一道防线 |
场景:字段缺失、分区缺失、记录丢失
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
关键字段为空 | 全层级 | 主键/核心度量字段不允许 NULL | 阻断 |
批量字段空值率超标 | ODS/DWD | 空值率超过阈值 | 告警 |
分区无数据 | ODS | 当日分区行数为 0 | 阻断 |
记录数低于下限 | ODS/DWD | 行数低于历史最低值的某个比例 | 告警 |
SELECT
SUM(CASEWHEN user_id ISNULLTHEN1ELSE0END) AS user_id_null,
SUM(CASEWHEN order_id ISNULLTHEN1ELSE0END) AS order_id_null,
SUM(CASEWHEN amount ISNULLTHEN1ELSE0END) AS amount_null
FROM dwd_order
WHERE dt = '${bizdate}';
-- 任一字段 > 0 则阻断
SELECT
COUNT(1) AS total,
ROUND((COUNT(1) - COUNT(phone)) / COUNT(1) * 100, 2) AS phone_null_pct,
ROUND((COUNT(1) - COUNT(address)) / COUNT(1) * 100, 2) AS address_null_pct,
ROUND((COUNT(1) - COUNT(city)) / COUNT(1) * 100, 2) AS city_null_pct
FROM ods_user_info
WHERE dt = '${bizdate}';
-- 超过设定阈值(如 phone > 5%)则告警
SELECT
CASEWHEN cnt = 0THEN'BLOCK'ELSE'PASS'ENDASresult
FROM (
SELECTCOUNT(1) AS cnt
FROM ods_order
WHERE dt = '${bizdate}'
) t;
场景:ETL 重跑导致重复、JOIN 膨胀、上游重复推送
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
主键重复 | DWD | 业务主键 COUNT > 1 | 阻断 |
联合键重复 | DWD/DWS | 多字段组合去重 | 阻断 |
整行重复 | ODS | 完全相同记录出现多次 | 告警 |
SELECT pk_col, COUNT(1) AS dup_count
FROM dwd_order
WHERE dt = '${bizdate}'
GROUP BY pk_col
HAVING COUNT(1) > 1
LIMIT 10;
-- 有结果则阻断
SELECT
COUNT(1) AS total,
COUNT(DISTINCT order_id) AS distinct_cnt,
ROUND((COUNT(1) - COUNT(DISTINCT order_id)) / COUNT(1) * 100, 4) AS dup_rate_pct
FROM dwd_order
WHERE dt = '${bizdate}';
-- dup_rate_pct > 0 需排查
SELECT
CASEWHEN after_cnt > before_cnt * 1.05THEN'WARN'ELSE'PASS'ENDASresult,
before_cnt,
after_cnt
FROM (
SELECT
(SELECTCOUNT(1) FROM ods_order WHERE dt = '${bizdate}') AS before_cnt,
(SELECTCOUNT(1) FROM dwd_order WHERE dt = '${bizdate}') AS after_cnt
) t;
场景:枚举值越界、数值异常、逻辑矛盾
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
枚举值不在合法范围 | DWD | NOT IN 合法值列表 | 告警 |
数值超出业务边界 | DWD/DWS | 金额为负/年龄超150等 | 告警 |
字段间逻辑矛盾 | DWD | 结束时间 < 开始时间 | 阻断 |
比率类越界 | DWS/ADS | 转化率 > 100% 或 < 0 | 阻断 |
SELECT status, COUNT(1) AS cnt
FROM dwd_order
WHERE dt = '${bizdate}'
AND status NOT IN ('created', 'paid', 'shipped', 'completed', 'cancelled')
GROUP BY status;
-- 有结果则告警
SELECT
COUNT(CASEWHEN amount < 0THEN1END) AS negative_amt,
COUNT(CASEWHEN amount > 10000000THEN1END) AS extreme_amt,
COUNT(CASEWHEN age < 0OR age > 150THEN1END) AS invalid_age
FROM dwd_order_detail
WHERE dt = '${bizdate}';
SELECT COUNT(1) AS conflict_count
FROM dwd_event
WHERE dt = '${bizdate}'
AND end_time < start_time;
-- > 0 则阻断
SELECT COUNT(1) AS invalid_count
FROM ads_funnel
WHERE dt = '${bizdate}'
AND (convert_rate > 1 OR convert_rate < 0);
场景:跨层对账、跨系统核对、维表关联缺失
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
上下游行数差异 | DWD→DWS | 明细聚合后 vs 汇总表 | 告警 |
上下游金额差异 | 全层级 | SUM 对比,差值超阈值 | 阻断 |
维表关联失败 | DWD | LEFT JOIN 后外键为空 | 告警 |
明细与汇总不一致 | DWS/ADS | 重新聚合 vs 已有结果 | 阻断 |
SELECT
src.cnt AS dwd_count,
tgt.cnt AS dws_count,
src.cnt - tgt.cnt AS diff
FROM
(SELECT COUNT(DISTINCT order_id) AS cnt FROM dwd_order WHERE dt = '${bizdate}') src,
(SELECT SUM(order_cnt) AS cnt FROM dws_shop_daily WHERE dt = '${bizdate}') tgt;
-- diff 不为 0 需排查
SELECT
ABS(a.total_amt - b.total_amt) AS amt_diff,
CASE
WHENABS(a.total_amt - b.total_amt) / a.total_amt > 0.001THEN'BLOCK'
ELSE'PASS'
ENDASresult
FROM
(SELECTSUM(amount) AS total_amt FROM dwd_order WHERE dt = '${bizdate}') a,
(SELECTSUM(order_amount) AS total_amt FROM dws_shop_daily WHERE dt = '${bizdate}') b;
SELECT
COUNT(1) AS total,
SUM(CASEWHEN b.user_name ISNULLTHEN1ELSE0END) AS miss_count,
ROUND(SUM(CASEWHEN b.user_name ISNULLTHEN1ELSE0END) / COUNT(1) * 100, 2) AS miss_rate_pct
FROM dwd_order a
LEFTJOIN dim_user b ON a.user_id = b.user_id
WHERE a.dt = '${bizdate}';
-- miss_rate_pct > 5% 则告警
SELECT
a.shop_id,
a.recalc_amt,
b.order_amount,
ABS(a.recalc_amt - b.order_amount) AS diff
FROM (
SELECT shop_id, SUM(amount) AS recalc_amt
FROM dwd_order
WHERE dt = '${bizdate}'
GROUP BY shop_id
) a
JOIN dws_shop_daily b ON a.shop_id = b.shop_id AND b.dt = '${bizdate}'
WHERE ABS(a.recalc_amt - b.order_amount) > 0.01;
场景:上游延迟、分区未产出、数据滞后
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
分区未按时产出 | ODS | 指定时间检查分区是否存在 | 阻断 |
数据事件时间滞后 | ODS/DWD | 最新事件时间远早于当前时间 | 告警 |
依赖表未就绪 | 全层级 | 上游分区检测 | 阻断 |
SELECT
MAX(event_time) AS latest_event,
CURRENT_TIMESTAMP() AS check_time,
(UNIX_TIMESTAMP(CURRENT_TIMESTAMP()) - UNIX_TIMESTAMP(MAX(event_time))) / 60 AS delay_minutes
FROM ods_event_log
WHERE dt = '${bizdate}';
-- delay_minutes > 阈值则告警
-- Hive
SHOW PARTITIONS table_name PARTITION (dt = '${bizdate}');
-- 无结果则阻断
SELECT
MAX(update_time) AS latest_sync,
CASE
WHEN MAX(update_time) < '${bizdate} 23:00:00' THEN 'INCOMPLETE'
ELSE 'PASS'
END AS result
FROM ods_order
WHERE dt = '${bizdate}';
场景:数据量突增突降、指标异常波动
场景 | 适用层级 | 校验逻辑 | 严重程度 |
|---|---|---|---|
日环比行数波动 | 全层级 | 与昨日对比超 ±30% | 告警 |
周同比波动 | 全层级 | 与上周同天对比 | 告警 |
核心指标突变 | DWS/ADS | 关键业务指标环比 | 阻断 |
WITH daily AS (
SELECT dt, COUNT(1) AS cnt
FROM dwd_order
WHERE dt BETWEENDATE_SUB('${bizdate}', 1) AND'${bizdate}'
GROUPBY dt
)
SELECT
t.cnt AS today,
y.cnt AS yesterday,
ROUND((t.cnt - y.cnt) / y.cnt * 100, 2) AS change_pct,
CASE
WHENABS((t.cnt - y.cnt) / y.cnt) > 0.3THEN'WARN'
ELSE'PASS'
ENDASresult
FROM daily t
JOIN daily y ON t.dt = DATE_ADD(y.dt, 1)
WHERE t.dt = '${bizdate}';
SELECT
a.cnt AS this_week,
b.cnt AS last_week,
ROUND((a.cnt - b.cnt) / b.cnt * 100, 2) AS wow_pct
FROM
(SELECT COUNT(1) AS cnt FROM dwd_order WHERE dt = '${bizdate}') a,
(SELECT COUNT(1) AS cnt FROM dwd_order WHERE dt = DATE_SUB('${bizdate}', 7)) b;
WITH metrics AS (
SELECT dt, SUM(amount) AS gmv
FROM dws_shop_daily
WHERE dt BETWEENDATE_SUB('${bizdate}', 7) AND'${bizdate}'
GROUPBY dt
)
SELECT
curr.dt,
curr.gmv,
prev.gmv AS prev_day_gmv,
ROUND((curr.gmv - prev.gmv) / prev.gmv * 100, 2) AS change_pct
FROM metrics curr
JOIN metrics prev ON curr.dt = DATE_ADD(prev.dt, 1)
WHERE curr.dt = '${bizdate}'
ANDABS((curr.gmv - prev.gmv) / prev.gmv) > 0.2;
-- 有结果则告警
校验项 | 建议阈值 | 处理方式 |
|---|---|---|
主键重复 | = 0 | 阻断 |
关键字段空值 | = 0 | 阻断 |
上下游金额差异 | < 0.1% | 阻断 |
行数日环比 | ±30% | 阻断 |
非关键字段空值率 | < 10% | 告警 |
维表关联缺失率 | < 5% | 告警 |
数据延迟 | < 60min | 告警 |