Why would the Parallelism (Repartition Streams) Operator Reduce Row Estimates to 1?

Query plans with bitmap filters can sometimes be tricky to read. From the BOL article for repartition streams (emphasis mine):

The Repartition Streams operator consumes multiple streams and produces multiple streams of records. The record contents and format are not changed. If the query optimizer uses a bitmap filter, the number of rows in the output stream is reduced.

The article on bitmap filtering is also helpful:

When analyzing an execution plan containing bitmap filtering, it is important to understand how the data flows through the plan and where filtering is applied. The bitmap filter and optimized bitmap are created on the build input (the dimension table) side of a hash join; however, the actual filtering is typically done within the Parallelism operator, which is on the probe input (the fact table) side of the hash join. However, when the bitmap filter is based on an integer column, the filter can be applied directly to the initial table or index scan operation rather than the Parallelism operator. This technique is called in-row optimization.

I believe that's what you're observing with your query. It is possible to come up with a relatively simple demo to show a repartition streams operator reducing a cardinality estimate, even when the bitmap operator is IN_ROW against the fact table. Data prep:

CREATE TABLE outer_tbl (ID BIGINT NOT NULL);

INSERT INTO outer_tbl WITH (TABLOCK)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master..spt_values;

CREATE TABLE inner_tbl_1 (ID BIGINT NULL);
CREATE TABLE inner_tbl_2 (ID BIGINT NULL);

INSERT INTO inner_tbl_1 WITH (TABLOCK)
SELECT (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) / 2000000 - 2) NUM
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

INSERT INTO inner_tbl_2 WITH (TABLOCK)
SELECT (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) / 2000000 - 2) NUM
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

Here is a query that you should not run:

SELECT *
FROM outer_tbl o
INNER JOIN inner_tbl_1 i ON o.ID = i.ID
INNER JOIN inner_tbl_2 i2 ON o.ID = i2.ID
OPTION (HASH JOIN, QUERYTRACEON 9481, QUERYTRACEON 8649);

I uploaded the plan. Take a look at the operator near inner_tbl_2:

(plan screenshot: the repartition streams operator losing rows)

You may also find the second test in Hash Joins on Nullable Columns by Paul White helpful.

There are some inconsistencies in how the row reduction is applied; I was only able to see it in a plan with at least three tables. However, the reduction in expected rows seems reasonable with the right data distribution. Suppose the joined column in the fact table has many repeated values that aren't present in the dimension table. The bitmap filter may eliminate those rows before they ever reach the join. For your query the estimate is reduced all the way to 1. How the rows are distributed among the threads by the hash partitioning function provides a good hint:

(plan screenshot: row distribution across parallel threads)

Based on that, I suspect that you have a lot of repeated values in the Object1.Column21 column. If those repeated values happen to fall outside the statistics histogram for Object4.Column19 then SQL Server could get the cardinality estimate very wrong.

I think you should be concerned, in that it may be possible to improve the performance of the query. Of course, if the query meets response time or SLA requirements then it may not be worth further investigation. If you do wish to investigate, there are a few things you can do (other than updating statistics) to get an idea of whether the query optimizer would pick a better plan given better information:

  • Put the results of the join between Database1.Schema1.Object10 and Database1.Schema1.Object11 into a temp table and see if you continue to get nested loop joins.
  • Change that join to a LEFT OUTER JOIN so the query optimizer won't reduce the number of rows at that step.
  • Add a MAXDOP 1 hint to your query to see what happens.
  • Use TOP along with a derived table to force the join to be evaluated last, or even comment the join out of the query entirely.

Hopefully these suggestions are enough to get you started.
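As a sketch of the temp table idea (all object and column names here are placeholders from the anonymized plan, and the join predicates are assumptions, so substitute your real names):

```sql
-- Hypothetical sketch only: materialize the first join so the rest of
-- the plan is optimized with an exact row count instead of an estimate.
-- Object10/Object11 and the join columns are anonymized placeholders.
SELECT o10.*
INTO #first_join
FROM Database1.Schema1.Object10 o10
INNER JOIN Database1.Schema1.Object11 o11
    ON o10.Column1 = o11.Column1; -- real join predicate unknown

-- Then join the temp table to the remaining tables and check whether
-- the optimizer still chooses nested loops:
SELECT *
FROM #first_join f
INNER JOIN Database1.Schema1.Object4 o4
    ON f.Column1 = o4.Column19; -- placeholder predicate
```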

Regarding the Connect item in the question, it is extremely unlikely to be related to your issue. That problem doesn't involve poor row estimates; it involves a race condition in parallelism that causes too many rows to be processed in the query plan behind the scenes. Here, your query doesn't appear to be doing any extra work.


The core problem here is the poor cardinality estimate for the result of the first join. This can arise for many reasons, but the most common are out-of-date statistics or correlated join predicates, which the optimizer's default model assumes are independent.

In the latter case, the Knowledge Base article FIX: Poor performance when you run a query that contains correlated AND predicates in SQL Server 2008 or in SQL Server 2008 R2 or in SQL Server 2012, and its supported trace flag 4137, may be relevant. You might also try the query with trace flag 4199 to enable optimizer fixes, and/or trace flag 2301 to enable modelling extensions. It's hard to know more on the basis of an anonymized plan.
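If you want to experiment, trace flags like these can be scoped to a single statement with QUERYTRACEON (which requires sysadmin permission, or a plan guide) rather than being enabled server-wide. A sketch, with the query body elided:

```sql
-- 4137: minimum-selectivity estimation for correlated AND predicates
--       (supported, per the KB article above; applies to the legacy CE)
-- 4199: enable query optimizer hotfixes
-- 2301: enable advanced optimizer modelling extensions
SELECT ... -- your query here
OPTION (QUERYTRACEON 4137, QUERYTRACEON 4199, QUERYTRACEON 2301);
```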

The presence of the bitmap doesn't directly affect the cardinality estimate of the join, but it does make its effect visible sooner by applying early semijoin reduction. Without the bitmap, the cardinality estimate for the first join would be the same, and the remainder of the plan would still be optimized accordingly.

If you're curious, on a test system, you can disable bitmaps for the query with trace flag 7498. You can also disable optimized bitmaps (considered by the optimizer and affecting cardinality estimates), replacing them with post-optimization bitmaps (not considered by the optimizer, no effect on cardinality) with a combination of trace flags 7497 and 7498. Neither is documented or supported for use on a production system, but they do produce plans that the optimizer could consider normally, and so can be forced with a plan guide.
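For completeness, a sketch of how those flags would be applied per-query on a test system (query body elided; these flags are undocumented and unsupported, as noted above):

```sql
-- Test systems only. Disable bitmap filters for the query entirely:
SELECT ... OPTION (QUERYTRACEON 7498);

-- Disable optimized bitmaps, leaving only post-optimization bitmaps
-- (which have no effect on cardinality estimates):
SELECT ... OPTION (QUERYTRACEON 7497, QUERYTRACEON 7498);
```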

None of this will solve the core problem of the poor estimate for the first join, as noted above, so I am really just mentioning it for interest's sake.

Further reading on bitmaps and hash joins:

  • Bitmap Magic (or… how SQL Server uses bitmap filters) by me
  • Parallel Hash Join by Craig Freedman
  • Query Execution Bitmaps by The SQL Server Query Processing Team
  • Interpreting Execution Plans Containing Bitmap Filters in Books Online
  • Understanding Hash Joins in Books Online