Sep 23: Parallel Query and 11g
The main edits required to the scripts were simple ones to correct the locations of trace files, for example (where $ORACLE_BASE is /ora on the 10g system)
rm /ora/admin/TEST1020/udump/*.trc
and $ORACLE_BASE is /u01/app/oracle on the 11g system
rm /u01/app/oracle/diag/rdbms/test11/TEST11/trace/*.trc
They were always hard-coded because I never planned to re-run the tests much but, even if I had used $ORACLE_BASE/admin/$ORACLE_SID/udump (and bdump), which would have been the correct approach, the scripts still wouldn't have worked because of Oracle's decision to play around with OFA directory structures yet again! I can't help feeling this is going to cause more problems than expected because there are a lot of scripts on systems out there that depend on the existing locations.
Anyway, on to the tests themselves. I was keen to try out the PARALLEL_IO_CAP_ENABLED parameter. The first step was to use the new I/O calibration tool. I simply lifted the example straight out of the documentation and amended the number of available disks.
SQL> @io
SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 lat INTEGER;
3 iops INTEGER;
4 mbps INTEGER;
5 BEGIN
6 -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
7 DBMS_RESOURCE_MANAGER.CALIBRATE_IO (5, 10, iops, mbps, lat);
8
9 DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
10 DBMS_OUTPUT.PUT_LINE ('latency = ' || lat);
11 dbms_output.put_line('max_mbps = ' || mbps);
12 end;
13 /
max_iops = 112
latency = 8
max_mbps = 32
PL/SQL procedure successfully completed.Well, it's certainly recognised how pathetic my i/o infrastructure is, as I've mentioned in the past. Oh, and I should mention that this is a 5-disk stripe set with a stripe width of 256KB. That's too small, but was the one the Linux installation decided on and matched the one I used in the original paper, so I've left it as it is for now.
Now to see how this impacts the decision to use PX for a query. The problem is that the existing scripts use hints to request a specific degree of parallelism (DOP), e.g.
SELECT /*+ parallel(tt1,$DOP) parallel(tt2, $DOP) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)
FROM test_tab1 tt1, test_tab2 tt2
WHERE tt1.pk_id = tt2.pk_id
GROUP BY MOD(tt1.pk_id + tt2.pk_id ,113)
ORDER BY MOD(tt1.pk_id + tt2.pk_id ,113); So I threw together a slightly modified script that requests PX but doesn't specify the DOP in the hint, leaving it to Oracle to decide :-
SELECT /*+ parallel(tt1) parallel(tt2) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)
Without parallel_io_cap_enabled
SQL> show parameter parallel_io NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ parallel_io_cap_enabled boolean FALSE
Next I'll run the test and then check the contents of v$pq_tqstat to see if parallelism was used (note - this is only enough of the output to illustrate my point).
SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
2 FROM v$pq_tqstat
3 ORDER BY dfo_number DESC, tq_id, server_type DESC , process;
DFO_NUMBER TQ_ID SERVER_TYP PROCESS NUM_ROWS BYTES
---------- ---------- ---------- ---------- ---------- ----------
1 0 Producer P000 8220476 58056397
Producer P001 8104871 57233905
Producer P002 8075353 57029517
Producer P003 7816421 55186002
Producer P004 8088829 57061516
Producer P005 8008482 56470221
Producer P006 8491092 59649083
Producer P007 8730476 61174367
Consumer P008 8189988 57718569
Consumer P009 8196864 57766896
Consumer P010 8192339 57734621
Consumer P011 8189512 57715446
Consumer P012 8192294 57734372
Consumer P013 8192216 57734131
Consumer P014 8193654 57745093
Consumer P015 8189133 57711880So the DOP is cpu_count (4) * parallel_threads_per_cpu (2) = 8 px slaves per slave set, with two slave sets used. That's what I expected.
With parallel_io_cap_enabled
Now I'll look at the effect of the new parameter.
SQL> alter system set PARALLEL_IO_CAP_ENABLED=true; System altered.
This time, the contents of v$pq_tqstat were
SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes 2 FROM v$pq_tqstat 3 ORDER BY dfo_number DESC, tq_id, server_type DESC , process; no rows selected
So, because of the limited i/o resources, Oracle's decided not to use PX for the same query, with the same hints. What I found interesting was that the execution plan still looks like PX is used. (Just a snippet of output again.)
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%
CPU)| Time | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
---------------------------------------------
| 0 | SELECT STATEMENT | | 65M| 751M| | 585K
(13)| 03:22:10 | | | |
| 1 | PX COORDINATOR | | | | |
| | | | |
| 2 | PX SEND QC (ORDER) | :TQ10002 | 65M| 751M| | 585K
(13)| 03:22:10 | Q1,02 | P->S | QC (ORDER) |
| 3 | SORT GROUP BY | | 65M| 751M| 2513M| 585K
(13)| 03:22:10 | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 65M| 751M| | 285K
(11)| 01:38:33 | Q1,02 | PCWP | |
| 5 | PX SEND RANGE | :TQ10001 | 65M| 751M| | 285K
(11)| 01:38:33 | Q1,01 | P->P | RANGE |
|* 6 | HASH JOIN | | 65M| 751M| 1126M| 285K
(11)| 01:38:33 | Q1,01 | PCWP | |
| 7 | PX RECEIVE | | 65M| 375M| | 112K
(9)| 00:38:52 | Q1,01 | PCWP | |
| 8 | PX SEND BROADCAST | :TQ10000 | 65M| 375M| | 112K
(9)| 00:38:52 | Q1,00 | P->P | BROADCAST |
| 9 | PX BLOCK ITERATOR | | 65M| 375M| | 112K
(9)| 00:38:52 | Q1,00 | PCWC | |
| 10 | TABLE ACCESS FULL| TEST_TAB2 | 65M| 375M| | 112K
(9)| 00:38:52 | Q1,00 | PCWP | |
| 11 | PX BLOCK ITERATOR | | 65M| 375M| | 112K
(9)| 00:39:00 | Q1,01 | PCWC | |
| 12 | TABLE ACCESS FULL | TEST_TAB1 | 65M| 375M| | 112K
(9)| 00:39:00 | Q1,01 | PCWP | |There'll be more to say later, no doubt, but that's all for now.
#1 - William Robertson said:
2007-09-26 23:48 - (Reply)
Interesting - I have some code I've been playing around with for submitting arbitrary PL/SQL procedures in parallel using PQ and pipelined functions, which worked in 10g but now just serialises in 11g. I did wonder whether 11g can tell more about my system (a single-disk VM that's short of memory) and is rightly but inconveniently deciding that PQ is not a good idea.
#2 - Doug Burns said:
2007-09-27 21:08 - (Reply)
William,
parallel_io_cap_enabled is false by default, so unless you've set it to true, it probably isn't the specific cause of your serialisation.
Maybe it's a CBO change. There have been quite a few changes to how PQ is costed in recent versions which Jonathan Lewis talks about in his CBO book.
#3 - William Robertson said:
2007-09-28 07:27 - (Reply)
I don't recall any discussion of 11g in Jonathan Lewis' recent book, although I look forward to his next one.
#4 - Doug Burns said:
2007-09-28 07:29 - (Reply)
Sorry, I meant the discussions of how the costing has changed across 8/9/10. I'm glad there wasn't any discussion of 11, it would have meant he hadn't had time to look at it properly ![]()
#5 - Doug Burns said:
2007-09-28 07:30 - (Reply)
In my world, 9 and 10 are still recent, sadly!
#6 - Rajeshwaran, Jeyabal 2010-02-03 15:41 - (Reply)
Hi:
Thanks for demo. Can you please explain me how to read this Explain plan for Parallel queries?
#7 - Doug Burns said:
2010-02-07 10:09 - (Reply)
That could be a long reply, so ....
Try this

