我有一个生成529,032行的查询,我将其存储在一个名为EFOT_DISTINCT_COMBOS的表中。
(这曾经是物化视图,但我们的服务器管理员建议我们使用普通表,并使用存储过程手动“刷新”查询中的数据。)
接下来我需要做的是CROSS JOIN,它遇到了一个很大的问题.服务器锁定并挂起,不做任何工作.(库缓存锁?)
因此,看起来我需要在查询中替换CROSS JOIN,并找到更好的方法来编写这个查询。
这个查询看起来像是噩梦,但实际上比看起来简单得多--它只是计算一些简单的条件总数。
(几个月前,我曾尝试使用窗口函数,但是窗口函数破坏了一些东西(我不记得确切的是什么),所以我不得不以某种原因恢复到简单的聚合。
我们正在CROSS JOINing的表称为SNAPSHOTS,它是一个非常短的表--只包含44行(它不会长得太大,最终可能最多增长80至100行)。
529,032 x 44 = 23,277,408行由此CROSS JOIN生成。
我试图通过在底部添加HAVING COUNT(ef.TIMESTAMP) > 0来减少这种占用,这可能会删除75%的结果(我不一定需要)。
这肯定会有所帮助,但查询仍然无法在服务器上执行。
SELECT
ss.TIMESTAMP,
ss.TIMESTAMP_DATE,
fc.FISCAL_QUARTER,
fc.LOB,
fc.FREQUENCY,
fc.EDGE_VP,
fc.EDGE_RM,
fc.EDGE_ASM,
NVL(SUM(ef.PIPELINE),0) AS PIPELINE,
NVL(SUM(ef.BEST),0) AS BEST,
NVL(SUM(ef.FORECAST),0) AS FORECAST,
NVL(SUM(ef.CLOSED),0) AS CLOSED,
CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PIPELINE_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_PIPELINE_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_PIPELINE_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_PIPELINE_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_PIPELINE_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_PIPELINE_DIFF),0) END AS PIPELINE_CHANGE,
CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.BEST_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_BEST_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_BEST_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_BEST_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_BEST_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_BEST_DIFF),0) END AS BEST_CHANGE,
CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.FORECAST_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_FORECAST_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_FORECAST_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_FORECAST_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_FORECAST_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_FORECAST_DIFF),0) END AS FORECAST_CHANGE,
CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.CLOSED_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_CLOSED_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_CLOSED_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_CLOSED_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_CLOSED_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_CLOSED_DIFF),0) END AS CLOSED_CHANGE,
CASE WHEN CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_PIPELINE),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_PIPELINE),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_PIPELINE),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_PIPELINE),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_PIPELINE),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_PIPELINE),0) END = 0 THEN 0
ELSE 100 * CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PIPELINE_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_PIPELINE_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_PIPELINE_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_PIPELINE_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_PIPELINE_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_PIPELINE_DIFF),0) END
/ CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_PIPELINE),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_PIPELINE),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_PIPELINE),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_PIPELINE),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_PIPELINE),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_PIPELINE),0) END END AS PIPELINE_PERCENT_CHANGE,
CASE WHEN CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_BEST),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_BEST),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_BEST),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_BEST),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_BEST),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_BEST),0) END = 0 THEN 0
ELSE 100 * CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.BEST_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_BEST_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_BEST_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_BEST_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_BEST_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_BEST_DIFF),0) END
/ CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_BEST),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_BEST),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_BEST),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_BEST),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_BEST),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_BEST),0) END END AS BEST_PERCENT_CHANGE,
CASE WHEN CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_FORECAST),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_FORECAST),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_FORECAST),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_FORECAST),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_FORECAST),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_FORECAST),0) END = 0 THEN 0
ELSE 100 * CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.FORECAST_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_FORECAST_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_FORECAST_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_FORECAST_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_FORECAST_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_FORECAST_DIFF),0) END
/ CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_FORECAST),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_FORECAST),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_FORECAST),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_FORECAST),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_FORECAST),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_FORECAST),0) END END AS FORECAST_PERCENT_CHANGE,
CASE WHEN CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_CLOSED),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_CLOSED),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_CLOSED),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_CLOSED),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_CLOSED),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_CLOSED),0) END = 0 THEN 0
ELSE 100 * CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.CLOSED_DIFF),0)
WHEN 'D' THEN NVL(SUM(ef.D_CLOSED_DIFF),0)
WHEN 'W' THEN NVL(SUM(ef.W_CLOSED_DIFF),0)
WHEN 'M' THEN NVL(SUM(ef.M_CLOSED_DIFF),0)
WHEN 'Q' THEN NVL(SUM(ef.Q_CLOSED_DIFF),0)
WHEN 'Y' THEN NVL(SUM(ef.Y_CLOSED_DIFF),0) END
/ CASE fc.FREQUENCY WHEN '%' THEN NVL(SUM(ef.PREV_CLOSED),0)
WHEN 'D' THEN NVL(SUM(ef.PREV_D_CLOSED),0)
WHEN 'W' THEN NVL(SUM(ef.PREV_W_CLOSED),0)
WHEN 'M' THEN NVL(SUM(ef.PREV_M_CLOSED),0)
WHEN 'Q' THEN NVL(SUM(ef.PREV_Q_CLOSED),0)
WHEN 'Y' THEN NVL(SUM(ef.PREV_Y_CLOSED),0) END END AS CLOSED_PERCENT_CHANGE -- ,
-- COUNT(ef.TIMESTAMP) AS DEAL_SS_COUNT
FROM EFOT_DISTINCT_COMBOS fc
CROSS JOIN SNAPSHOTS ss
LEFT JOIN EDGE_FORECAST_OVER_TIME ef
ON ef.TIMESTAMP = ss.TIMESTAMP
AND ef.fq_selection = NVL(fc.FISCAL_QUARTER,'%')
AND ef.lob_selection = NVL(fc.LOB,'%')
AND ((fc.FREQUENCY = 'D' AND ef.FREQUENCY in('D', 'W', 'M', 'Q', 'Y') AND ef.TIMESTAMP_DATE >= TRUNC(LOCALTIMESTAMP) - INTERVAL '1' MONTH)
OR (fc.FREQUENCY = 'W' AND TO_CHAR(ef.TIMESTAMP, 'fmDAY') = 'MONDAY' AND ef.TIMESTAMP_DATE >= TRUNC(LOCALTIMESTAMP) - INTERVAL '3' MONTH)
OR (fc.FREQUENCY = 'M' AND ef.FREQUENCY in('M', 'Q', 'Y') AND ef.TIMESTAMP_DATE >= TRUNC(LOCALTIMESTAMP) - INTERVAL '1' YEAR)
OR (fc.FREQUENCY = 'Q' AND ef.FREQUENCY in('Q', 'Y'))
OR ef.FREQUENCY LIKE NVL(fc.FREQUENCY,'%'))
AND ef.EDGE_VP LIKE NVL(fc.EDGE_VP,'%')
AND ef.EDGE_RM LIKE NVL(fc.EDGE_RM,'%')
AND ef.EDGE_ASM LIKE NVL(fc.EDGE_ASM,'%')
WHERE ss.TABLE_NAME = 'EDGE_FORECAST'
GROUP BY ss.TIMESTAMP ,
ss.TIMESTAMP_DATE ,
fc.FISCAL_QUARTER ,
fc.LOB ,
fc.FREQUENCY ,
fc.EDGE_VP ,
fc.EDGE_RM ,
fc.EDGE_ASM
HAVING COUNT(ef.TIMESTAMP) > 0;接下来,尝试将此查询拆分为两个查询。
CROSS JOIN并为总计列创建23,277,408行0。UPDATE的第二个查询全部为23,277,408行,并计算总列的值。第一次查询非常有效,大约在12分钟内执行。
但是第二个查询变得更加复杂,它运行了将近6个小时,最终失败了:
ORA-01555: snapshot too old。
有什么更好的方法可以写这个,这样服务器才能真正完成工作吗?
在我们添加所有的LOB过滤器之前,这个查询以前是工作的(从执行计划的复杂性来看,我猜有一个额外的过滤器将它推到了边缘)。
(PasteBin,因为这些查询对于StackOverflow来说太长了。)
发布于 2020-04-20 05:59:11
戈登·林诺夫评论道:
“昂贵的查询成本几乎总是由于数据移动--连接、聚合和排序。也有例外。例如,长文本字段或复杂的JSON可能会增加很大的开销,但它通常是数据移动。”
此外,Del评论说:
“快速查看几件事后,我认为您的查询太冗长了。您在ef表上做了一个左联接。但是您的HAVING子句是检查该表上的计数是否大于0。只需将其切换到内部联接即可。在此之后,如果ef表中需要列,则可以删除代码中的一些NVL函数。”
这两个评论都提到了连接,这可能是导致性能问题的一个原因。
我遵循Del的建议,删除了HAVING子句,并将LEFT JOIN转换为INNER JOIN。
我还在可能的地方删除了大多数NVL函数,或者将它们合并,使查询更简单。
只有这两个更改,查询才能在15分钟内成功执行,而不是运行几个小时,最终失败。
我敢打赌LEFT JOIN是问题的99%。
感谢大家花时间读了我所有荒谬的代码,并给了我提示!
https://stackoverflow.com/questions/61275399
复制相似问题