Optimizing a SELECT query that runs slowly on Oracle but quickly on SQL Server

I'm trying to run the following SQL statement in Oracle, and it takes a very long time to run:

SELECT orderID FROM tasks WHERE orderID NOT IN 
(SELECT DISTINCT orderID FROM tasks WHERE
 engineer1 IS NOT NULL AND engineer2 IS NOT NULL)

If I run only the sub-part that is inside the IN clause, it runs very quickly in Oracle, i.e.

SELECT DISTINCT orderID FROM tasks WHERE
engineer1 IS NOT NULL AND engineer2 IS NOT NULL

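Why does the whole statement take so long in Oracle? In SQL Server the whole statement runs quickly.

Alternatively, is there a simpler/different/better SQL statement I should use?

Some more details about the problem:

>Each order is made up of many tasks.
>Each order is either allocated (one or more of its tasks has both engineer1 and engineer2 set) or unallocated (all of its tasks have NULL in the engineer fields).
>I am trying to find all of the orderIDs that are unallocated.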
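
One possible portable rewrite of the requirement described above is an aggregate query. This is only a sketch: it assumes that one row per unallocated orderID is wanted (the original NOT IN statement returns one row per task), and it uses only standard SQL that both Oracle and SQL Server accept.

```sql
-- Sketch only: returns each unallocated orderID once.
-- COUNT ignores NULLs, and the CASE yields NULL unless both engineers are set,
-- so a count of 0 means no task of the order has both engineers assigned.
SELECT orderID
FROM tasks
GROUP BY orderID
HAVING COUNT(CASE
               WHEN engineer1 IS NOT NULL AND engineer2 IS NOT NULL THEN 1
             END) = 0;
```

Because this reads the tasks table once instead of comparing every row against a subquery result, it tends to depend less on the optimizer's choice of plan, though that should be confirmed against the real data.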

In case it makes any difference, there are ~120k rows in the table and 3 tasks per order, so ~40k distinct orders.

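Responses to answers:

>I would prefer a SQL statement that works in both SQL Server and Oracle.
>The tasks table only has indexes on orderID and taskID.
>I tried the NOT EXISTS version of the statement, but it ran for over 3 minutes before I cancelled it. Perhaps a JOIN version of the statement is needed?
>There is also an "orders" table with an orderID column, but I was trying to simplify the question by not including it in the original SQL statement.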
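
A JOIN version of the statement, as asked about in the list above, could be written as an anti-join. The sketch below is an assumption about what such a rewrite might look like, not a tested fix; the aliases t and a are illustrative, and it returns one row per task, like the original.

```sql
-- Sketch only: anti-join form of the original NOT IN query.
-- The derived table lists allocated orders; rows of tasks with no match
-- in it (a.orderID IS NULL after the LEFT JOIN) belong to unallocated orders.
SELECT t.orderID
FROM tasks t
LEFT JOIN (
    SELECT DISTINCT orderID
    FROM tasks
    WHERE engineer1 IS NOT NULL AND engineer2 IS NOT NULL
) a ON a.orderID = t.orderID
WHERE a.orderID IS NULL;
```

One difference to keep in mind: if orderID can be NULL, NOT IN and the anti-join do not return the same rows, because NOT IN returns nothing at all when the subquery yields a NULL.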

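My guess is that in the original SQL statement the sub-query is being run once for every row in the first part of the statement, even though it is static and should only need to be run once?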
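
One way to check this guess in Oracle is to look at the execution plan. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY; how to read the output is described only in general terms, since the actual plan depends on the Oracle version and the statistics available.

```sql
-- Sketch only: store the optimizer's plan for the original statement...
EXPLAIN PLAN FOR
SELECT orderID FROM tasks WHERE orderID NOT IN
  (SELECT DISTINCT orderID FROM tasks
   WHERE engineer1 IS NOT NULL AND engineer2 IS NOT NULL);

-- ...then display the plan that was just stored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

A FILTER step with the subquery as its child usually suggests the subquery is evaluated repeatedly for rows of the outer query, whereas an anti-join step (for example HASH JOIN ANTI) suggests it is evaluated once.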

Running

ANALYZE TABLE tasks COMPUTE STATISTICS;

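made my original SQL statement execute much faster.

I am still curious, though, about why I have to do this, and if/when I will need to run it again.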


The statistics give Oracle's cost-based optimizer information that it needs to determine the efficiency of different execution plans: for example, the number of rows in a table, the average width of rows, highest and lowest values per column, number of distinct values per column, clustering factor of indexes etc.

In a small database you can just set up a job to gather statistics every night and leave it alone. In fact, this is the default under 10g. For larger implementations you usually have to weigh the stability of the execution plans against the way that the data changes, which is a tricky balance.
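
To judge whether statistics need gathering again, one option is to check when the table was last analyzed and how many rows the optimizer believes it has. The sketch below queries the USER_TABLES dictionary view and assumes the table lives in the current schema under the default uppercase name TASKS.

```sql
-- Sketch only: compare num_rows with the real row count (~120k here);
-- a NULL last_analyzed means no statistics have been gathered yet.
SELECT table_name, num_rows, last_analyzed
FROM user_tables
WHERE table_name = 'TASKS';
```

If num_rows is far from the actual row count, or last_analyzed predates a large data load, the statistics are probably stale.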

Oracle also has a feature called "dynamic sampling" that is used to sample tables to determine relevant statistics at execution time. It's much more often used with data warehouses where the overhead of the sampling is outweighed by the potential performance increase for a long-running query.
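
Dynamic sampling can also be requested for a single statement with the DYNAMIC_SAMPLING hint. The sketch below is illustrative only: the alias t and the sampling level 4 are arbitrary choices, and whether the hint changes the plan for this query is not something the question establishes.

```sql
-- Sketch only: ask the optimizer to sample tasks (alias t) when parsing.
-- Level 4 is an arbitrary illustrative value; valid levels run from 0 to 10.
SELECT /*+ DYNAMIC_SAMPLING(t 4) */ t.orderID
FROM tasks t
WHERE t.orderID NOT IN
  (SELECT t2.orderID
   FROM tasks t2
   WHERE t2.engineer1 IS NOT NULL AND t2.engineer2 IS NOT NULL);
```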

Solution

Often this type of problem goes away if you analyze the tables involved (so that Oracle has a better idea of the distribution of the data):

ANALYZE TABLE tasks COMPUTE STATISTICS;
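
For gathering optimizer statistics, Oracle's documentation recommends the DBMS_STATS package over ANALYZE. A rough equivalent of the statement above might look like the sketch below, which assumes the table is in the current schema and that index statistics should be gathered as well.

```sql
-- Sketch only: gather statistics for TASKS in the current schema.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,     -- current schema; an assumption for this example
    tabname => 'TASKS',
    cascade => TRUE      -- also gather statistics on the table's indexes
  );
END;
/
```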
